The Psychological Corporation’s Index of Public Opinion


Henry C. Link 
The Psychological Corporation, New York City 


This is the thirteenth in this series of surveys begun in 1937 as experi- 
mental studies in social psychology. It was made during the first three 
weeks in October with 5,000 personal interviews in 123 cities and towns 
representing a cross-section of the urban and small town population. 
The interviews were made by 459 interviewers under the direction of 130 
psychologists. Two questionnaires were used, each with one-half the 
sample, so that some questions were asked of 5,000 people and others of 
only 2,500. The number of interviews for each question is given in the 
tables. All interviews were made in the home, but only one in a family. 
Half were made with women, half with men. 

The interviews were distributed by four socio-economic groups,
referred to in the following tables as A, B, C, and D. This distribution
was made in accordance with the socio-economic maps in each locality,
according to which the local supervising psychologist assigned the calls
to be made by streets and blocks. The great differences between the
thinking of these various socio-economic groups are shown in some of the
tables.
These differences, incidentally, are also an indication of the thoroughness 
with which these interviews have been distributed by socio-economic 
levels. 
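
The quota arithmetic implied by the interview totals can be sketched as follows. This is only an illustration, not the Corporation's own procedure: the group sizes are read off the "Total Interviews" rows of the full-sample tables (500, 1,500, 2,000, and 1,000), and the function names are hypothetical.

```python
# Interviews per socio-economic group in the full sample of 5,000,
# as printed in the "Total Interviews" rows of the full-sample tables.
FULL_SAMPLE = {"A": 500, "B": 1500, "C": 2000, "D": 1000}


def group_shares(counts):
    """Return each group's share of all interviews (hypothetical helper)."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def half_sample(counts):
    """Return the group sizes when a questionnaire goes to half the sample."""
    return {group: n // 2 for group, n in counts.items()}


shares = group_shares(FULL_SAMPLE)
halves = half_sample(FULL_SAMPLE)
```

Under these figures the four groups make up one-tenth, three-tenths, four-tenths, and two-tenths of the interviews, and the half-sample sizes agree with the totals printed under the questions asked of 2,500 people.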

Some of the questions in this survey were asked for the first time. 
Others are questions which have been asked in as many as seven or eight 
earlier studies. Where questions have been repeated, the results of a 
few of the earlier studies are included. Some of these questions were 
asked also of 1,000 college students in colleges throughout the country. 
The results are included in the tables. 


The Trend Toward Socialism 


The greatest and most obvious trend throughout the world today is
the trend toward complete Government control of business, variously
referred to as Socialism, Collectivism, Totalitarianism, Dictatorship,
Fascism, or Communism. Does the American public think that this
country is also headed for Socialism? Are they ready for Socialism?
Here are a few straws in the wind.


Q. “Did you know that the English people elected the Socialist-Labor
party last July?” (Asked of half the sample, or 2,500.)


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    72    92    84    70    49      86
No............................    28     8    16    30    51      14
Total Interviews..............  2500   250   750  1000   500    1000


This table illustrates the extreme differences in socio-economic groups 
on the matter of knowledge. Whereas 8 per cent in the A group said they 
did not know about the English elections, 51 per cent of the D group 
professed ignorance. The results of the next question also show inter- 
esting differences but in a quite different direction. 
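
The "Total" column in these tables can be checked against the group columns. The sketch below assumes that the total is simply the group percentages pooled in proportion to the number of interviews in each group (250, 750, 1,000, and 500 in the half sample); the article does not state this, and the function name is hypothetical.

```python
# Half-sample interviews per group and the per cent answering "Yes" and "No"
# to the question on the English elections, as read from the table.
GROUP_N = {"A": 250, "B": 750, "C": 1000, "D": 500}
YES_PCT = {"A": 92, "B": 84, "C": 70, "D": 49}
NO_PCT = {"A": 8, "B": 16, "C": 30, "D": 51}


def pooled_per_cent(pct_by_group, n_by_group):
    """Weight each group's per cent by its number of interviews."""
    total_n = sum(n_by_group.values())
    weighted = sum(pct_by_group[g] * n_by_group[g] for g in n_by_group)
    return weighted / total_n


pooled_yes = pooled_per_cent(YES_PCT, GROUP_N)
pooled_no = pooled_per_cent(NO_PCT, GROUP_N)
```

On this assumption the pooled figures round to the 72 and 28 per cent printed in the Total column.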


Q. “Do you think that this means that the U. S. is also going toward
Socialism?”


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    27    35    32    26    19      28
No............................    53    56    57    53    46      62
Don’t know....................    20     9    11    21    35      10
Total Interviews..............  2500   250   750  1000   500    1000


This table shows differences between socio-economic groups on a 
matter of opinion. According to these results, the higher the socio- 
economic status, the higher the belief that the United States is or is not 
headed for Socialism. This strange result may represent a conflict be- 
tween wishful and realistic thinking. Our studies in other connections 
have shown that a great many people, especially in the C and D groups, 
representing the large population of industrial workers, do not know what 
Socialism is, and do not know what Capitalism is either. 

The following two questions were asked in the reverse order in one- 
half the interviews. There was no significant difference in the results, 
and, therefore, the results have been combined. 


Q. “Is it good for America in peacetime for the Government to set top
prices which stores and factories may charge for their goods?”


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    51    36    46    53    61      46
No............................    43    61    49    41    29      51
Don’t know....................     6     3     5     6    10       3
Total Interviews..............  5000   500  1500  2000  1000    1000


Q. “Is it good for America in peacetime for the Government to set top
limits for workers’ wages and salaries?”


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    39    28    34    41    51      38
No............................    52    67    59    51    36      60
Don’t know....................     9     5     7     8    13       2
Total Interviews..............  5000   500  1500  2000  1000    1000


Here the variations between socio-economic groups are not only con- 
siderable, but consistent. The higher the socio-economic group, the 
higher the rejection of price and wage controls, and vice versa. Regard- 
less of status, a substantial proportion of the urban population approves 
wage controls and, even more, price controls. Obviously, insofar as these
controls are imposed by the Government on industry, Socialism is
substituted for a free economy.


Take-Home Pay and Shorter Hours


That the American public is not completely unrealistic in its thinking
is shown by the answers to the following questions:


Q. “If a man was paid $50 a week for 48 hours work in wartime and he is
now working only 40 hours a week, should he still be paid $50?”


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    42    29    38    45    52      30
No............................    50    65    56    47    39      66
Don’t know....................     8     6     6     8     9       4
Total Interviews..............  2500   250   750  1000   500    1000


These results represent the effects of considerable wishful thinking. 
It would have been interesting, from the research point of view, to see 
what these answers would have been if we had used the sum, $80, instead 
of $50 in our question. We should also have asked: “Do you think that
such increases can be made generally without a corresponding increase in
prices?”


Take-Home Pay and the Cost of Living 


In view of the controversy over the effects of reconversion on wages in 
reference to the cost of living, the testimony of people themselves as to 
their present status is unusually interesting. Because this question was 
asked in a number of earlier studies proper comparisons of the present 
with the past can be made. The question asked was as follows:

Q. “Is your family more prosperous (or better off) today than two years
ago, less prosperous, or the same?”


                                 Oct.   Oct.   Apr.   Oct.   Coll.
Answers                          1941   1943   1945   1945  Studs.
                                    %      %      %      %       %
More prosperous...............     38     29     28     32      31
About the same................     47     46     48     51      55
Less prosperous...............     15     23     21     15      12
No answer.....................      —      2      3      2       2
Total Interviews..............   2000   2500   2500   2500    1000


Therefore, in spite of the abrupt termination of many war industries, 
and the wholesale changeover from wartime to peacetime jobs, an even 
greater majority of people, 83 per cent, claim that they are as prosperous 
or more prosperous than they were two years ago. Moreover, this pros- 
perity extends pretty evenly through all socio-economic groups as shown 
in the following table. 

                                       Socio-Econ. Groups
Oct. 1945                      Total     A     B     C     D
                                   %     %     %     %     %
More prosperous...............    32    32    31    29    39
About the same................    51    50    53    53    47
Less prosperous...............    15    16    15    15    12
No answer.....................     2     2     1     3     2
Total Interviews..............  2500   250   750  1000   500


The Public’s Spending Plans


The current high prosperity of the public is further demonstrated by
the replies to the question:

Q. “How much of the money you have saved since the war do you expect
to use within a year or two after the war stops: all of it, two-thirds of
it, one-third of it, or none of it?”


Of all respondents, 21 per cent said that they were uncertain, while 8 
per cent said they had no savings. The remainder, as compared with 
those who answered the same question in October 1944, answered as 
follows: 


Oct. Oct. 
1944 1945 
% % 
48 56 planned not to spend any savings 
24 21 planned to spend one-third 
15 9 planned to spend two-thirds 
13 14 planned to spend all their savings 


This and the next question were asked after we had first talked with
people about their buying plans and after they had enumerated the things
they were planning to buy. After they had stated their buying intentions,
we asked:

Q. “Do you intend to pay for these things out of your current earnings,
or by using the cash you have in the bank, or by cashing in your war bonds, or
by buying them on the installment plan?”


Answers                            %
[label illegible].............    45
[label illegible].............    32
Installment plan..............    23
[label illegible].............     8
[label illegible].............    10
Total.........................   118*


* Per cents add to more than 100 because some people gave two or more answers. 


The Returning Veterans 


The rapidity with which service men and women are returning and 
adjusting themselves to civilian life and peace jobs is indicated by the 
answers to the following questions: 

Q. “How many members of your family (living at home) have been in the
armed services? How many have come home for good? How many of these
now have jobs?”


                                               No.      %
Families who had 1 or more members
  in the armed services..................     1004     40
None in the armed services...............     1496     60
Total....................................     2500


These results show how extensively demands of war affected the 
families of the United States, at least the urban population. The 1,004, 
or 40 per cent, of families who had one or more members in the armed 
services had a total of 1,514 members in the armed services. Of this 
number 459, or 30 per cent, have come home for good. Of those who have 
come home for good, 65 per cent have jobs. 
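
The figures in this paragraph can be rechecked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are hypothetical, and the number of returned members with jobs is an approximation derived from the rounded 65 per cent.

```python
# Counts quoted in the text for the half sample of 2,500 families.
FAMILIES_TOTAL = 2500
FAMILIES_WITH_SERVICE = 1004
MEMBERS_IN_SERVICE = 1514
HOME_FOR_GOOD = 459
SHARE_WITH_JOBS = 0.65  # reported share of returned members who have jobs


def per_cent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


families_pct = per_cent(FAMILIES_WITH_SERVICE, FAMILIES_TOTAL)
returned_pct = per_cent(HOME_FOR_GOOD, MEMBERS_IN_SERVICE)
members_per_family = MEMBERS_IN_SERVICE / FAMILIES_WITH_SERVICE
approx_with_jobs = round(SHARE_WITH_JOBS * HOME_FOR_GOOD)
```

The first two results round to the 40 and 30 per cent given in the text, and the families concerned averaged about one and a half members in the services.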


Who is Doing the Best Job of Reconversion 


In view of the sharp controversies over the reconversion from war jobs 
to peace jobs, the following question was thought timely and was asked 
in half our sample, or 2,500 families: 


Q. “How do you think that the changeover from war jobs to peace jobs 
is being handled up to now? For instance, do you think the Government has 
done a good, fair, or poor job? How about the Army—good, fair, or poor? 
The Navy? The Labor Unions? Employers or businessmen?” 


Agencies                        Good  Fair  Poor  D.K.
                                   %     %     %     %
The Government................    35    33    20    12
The Army......................    50    23    13    14
The Navy......................    49    21    10    20
The Labor Unions..............    16    16    54    14
Employers or businessmen......    40    30    14    16


Evidently, the Army and Navy rate very well in the eyes of the public
in spite of the many criticisms that have been levelled against them for 
holding up demobilization. The Government and employers rate about 
equally well, but the Labor Unions are rated as having done a poor job 
by a conspicuous majority. 

In the other half of our sample we repeated a question which has been 
used periodically since October 1941, and which represents reconversion 
in its broadest phases. This question was: 

Q. “Who do you think can do the best job in [words illegible] out
after the war: the Government in Washington; Business Leaders; Labor Union
Leaders; or others?”


                                    Oct.       Oct.   Apr.   Oct.   Coll.
Answers                             1941       1943   1945   1945  Studs.
                                       %          %      %      %       %
Gov’t in Washington...........        47         42     46     51      62
Business Leaders..............        26         28     21     22      21
Labor Union Leaders...........         5          8      8      9       4
All three together............  [illeg.]   [illeg.]     11     12      10
Others or had no opinion......        17         17     17     11       7
Total Interviews..............      2000       2500   2500   2500    1000


The above per cents add up to more than 100 because some people 
named two agencies. Reliance on the Government is at a high point. 
The results by socio-economic groups are especially significant, as may 
be seen from the following table: 

                                  Socio-Econ. Groups
Answers                            A     B     C     D
                                   %     %     %     %
Gov’t in Washington...........    43    46    55    56
Business Leaders..............    36    29    17    12
Labor Union Leaders...........     3     6    10    15
All three together............    19    15    12     7
Others or had no opinion......     7    10    10    15
Total Interviews..............   250   750  1000   500


Unemployment Compensation 


In view of the controversy over the desirability of extending the 
periods and amounts of unemployment compensation, the following 
question was asked: 


Q. “Do you think that unemployment insurance is keeping many people
who have been laid off from taking new peacetime jobs?”


                                       Socio-Econ. Groups      Coll.
Answers                        Total     A     B     C     D  Studs.
                                   %     %     %     %     %       %
Yes...........................    52    65    60    51    39      43
No............................    36    26    32    38    42      49
Don’t know....................    12     9     8    11    19       8
Total Interviews..............  2500   250   750  1000   500    1000


Although sharp differences are shown by socio-economic groups, a
large proportion of people even in the large C and D industrial groups say
that they believe unemployment insurance at present is keeping many
people from taking new peacetime jobs.


Public Postwar Optimism 


The sharp swing toward greater optimism in postwar prospects which 
was shown by the April 1945 survey was maintained in the present survey 
and, in some respects, even heightened. This is particularly true in 
respect to wages. The questions and the results were as follows: 


Q. “During the next year or two do you think that the people of this
country will be better or worse off than they are now?” 


                                 Oct.   Oct.   Apr.   Apr.   Oct.   Coll.
Answers                          1941   1943   1944   1945   1945  Studs.
                                    %      %      %      %      %       %
Will be better off............     18     48     31     55     49      42
Will be worse off.............     69     32     47     27     32      47
[label illegible].............     18     20     22     18     19      11
Total Interviews..............   2000   2500   2500   2500   2500    1000


Q. “How about jobs; will there be more, fewer, about the same?”


                                 Oct.   Oct.   Apr.   Apr.   Oct.   Coll.
Answers                          1941   1943   1944   1945   1945  Studs.
                                    %      %      %      %      %       %
Will be more jobs.............      8     26     22     34     35      21
About the same................     11     20     17     22     26      28
Will be fewer jobs............     74     46     51     38     33      49
Don’t know....................      7      8     10      6      6       2


Q. “Will wages be higher, about the same, or lower?”


                                 Oct.   Oct.   Apr.   Apr.   Oct.   Coll.
Answers                          1941   1943   1944   1945   1945  Studs.
                                    %      %      %      %      %       %
Will be higher................     10      8      6     10     25      11
Will be about same............     20     26     26     35     34      32
Will be lower.................     60     60     60     51     36      55
Don’t know....................     10      6      8      4      5       2


Q. “Will our Government be less democratic, more democratic, or the
same as now?”


                                 Oct.   Oct.   Apr.   Apr.   Oct.   Coll.
Answers                          1941   1943   1944   1945   1945  Studs.
                                    %      %      %      %      %       %
Less democratic...............     26     19     17     22     15      19
About the same................     33     34     31     37     45      45
More democratic...............     19     30     27     28     27      30
Don’t know....................     22     17     25     13     13       6


The Public Predicts the Next War


In view of the tremendous interest in peace and measures for a per- 
manent peace, we repeated a question which we asked first in a depth 
study in February 1943 (Link, H. C., An experiment in depth interview- 
ing on the issue of internationalism vs. isolationism, Pub. Opin. Quart., 
1943, 6, 267-279). The question was as follows: 


Q. “After this war, do you think that we will make a peace settlement
that will last, or do you think that we will have another world war in
twenty-five years or so?”


                                 Feb.   Oct.   Apr.   Oct.   Coll.
Answers                          1943   1944   1945   1945  Studs.
                                    %      %      %      %       %
Will have another war.........     43     54     51     59      71
Will make a lasting peace.....     47     28     33     28      22
Don’t know....................     10     18     16     13       7
Total Interviews..............    200   2500   2500   2500    1000


Q. “Who do you think will be our next enemy?”

Answers by Those Who Said There Would be Another War


                    Oct.  Apr.  Oct.                        Oct.  Apr.  Oct.
                    1944  1945  1945                        1944  1945  1945
                       %     %     %                           %     %     %
Russia..........      29    27    37     England........       4     4     3
Germany.........       9     6     2     [illegible]....       1     1     1
[illegible].....       5     3     5     Don’t know.....       6    10    11

Total.................................................        54    51    59


This reflects a steady and sharp increase in the percent who expect 
another war, except for the April 1945 period which reflected the result 
of the San Francisco Conference. There is a sharp increase in those who 
believe that the next war will be with Russia. Of the 59 per cent who 
anticipate another war, about 70 per cent name Russia as the next foe. 


Travel by Airplane 


In view of the tremendous development of aviation through the war
and its bright prospects after the war, we repeated a question asked in
1943, namely:


Q. “Have you ever taken a trip in an airplane? (If Yes) How many?”


                                 Oct.   Oct.
Answers                          1943   1945
                                    %      %
Yes...........................     28     31
No............................     72     69
Total Interviews..............   2500   2500


We then asked the following question in the October 1945 study to
obtain some measure of the extent to which some people were planning to 
travel by air in the near future, namely: 


Q. “Are you planning to take such a trip in the near future, say within a
year?”


                                 Oct.
Answers                          1945
Yes...........................   [illegible]
No............................   [illegible]
Total Interviews..............   2500


Since this question was not asked before, a comparison is not possible. 
Nevertheless, it indicates that a very large number of people are planning 
to travel by air shortly. 


Received November 23, 1945.


A Color Aptitude Test, 1940 Experimental Edition


Forrest Lee Dimmick * 
Hobart College 


For a long time there has been a need in many industries for a means
of determining the suitability of workers for their jobs in fields which
require the matching of colors. Expressions of this need have appeared as
a rule in house organs or in reports to technical associations.¹ One
attempt at least has been made to deal directly with the problem, but while
it pointed the way, its materials are not available for general use.²

At the annual meeting of the Inter-Society Color Council in February 
1939, in a session given over to discussion, the problem was formulated 
and a call made for assistance. Dyers of silk textiles, for example, are 
given a swatch of cloth or a skein of yarn which must be duplicated. 
They must decide the dyes to be used and the precise amount of each 
component that is required, by successive trial dyeings. Hence, dyers 
must make quick and accurate judgments of color. While extreme de- 
ficiencies in color vision are not likely to go undetected for long, it is 
obvious that an industry does not wish to train a new worker in the use 
of dyes only to find that he cannot discriminate between them. On the
other hand, anomalies less than “color blindness” cause serious difficulties
because they go unnoted until some glaring error is committed. An
example of such a case is a dyer who, in a series of dyeings of a particular 
red, made every successive match slightly too yellow with the result that 
the final dyeings had to be rejected. Of equal importance is a sheer 
ineptness with colors that may appear because aptitude for making 
matches is often the very last consideration that is taken into account in 
selecting apprentice dyers. 

As a result of its discussion, the Color Council directed its Problem 
Committee to undertake the development of a test or a series of tests 
that would answer the needs of various industries and of other fields where 
color matching plays an important role.³

* Co-Chairman, Inter-Society Color Council Committee on Problem 10, Color
Aptitude Test.

1 American Pulp and Paper Association. Color blindness, Report No. 34, 1941.

2 W. O. D. Pierce. The selection of color workers, London, Pitman, 1934.

3 In order to make clear why such a problem could be undertaken with some
confidence, the organization of the Inter-Society Color Council should be
understood. The Council consists primarily of the representatives of
thirteen national organizations, namely: American Artists Professional
League; Amer. Asso. of Textile Chemists and Colorists; Am. Ceramic Soc.;
Am. Psychological Assoc.; Am. Soc. for Testing Materials; Federation of
Paint and Varnish Production Clubs; Illuminating Engin. Soc.; Am.
Pharmaceutical Assoc.; Optical Soc. of Am.; Soc. of Motion Pict. Engin.;
Tech. Asso. of the Pulp and Paper Industry; Textile Color Card Assoc. of
the U. S.; U. S. Pharmacopoeial Convention. Not only do these groups bring
to the Council their numerous problems in color, but they offer a wealth of
technical and practical information. Thus, the color test upon which we are
working has required a technical skill in preparation of materials that
would have been prohibitively expensive on an individual basis, if
available at all. In addition, diverse industries in which color matching
plays an important part are open for standardization purposes, on
populations in which relatively high color skills have been demonstrated.
The committee appointed by the I.S.C.C. to develop the test is constituted
as follows: Co-Chairmen, Forrest L. Dimmick, A.P.A., and Carl E. Foss,
A.S.T.M.; I. A. Balinkin, A.C.S.; C. Z. Draves, A.A.T.C.C.; W. C. Granville,
O.S.A.; J. P. Guilford, A.P.A.; Le Grand Hardy, I.M.; H. Helson, A.P.A.;
D. B. Judd, O.S.A.; N. Macbeth, I.E.S.; Elsie Murray, A.P.A.; S. M. Newhall,
A.P.A.; D. Nickerson, O.S.A.; J. L. Parsons, T.A.P.P.I.; L. Sloan, A.P.A.;
A. H. Taylor, I.E.S.; M. J. Zigler, A.P.A.

When we attempted to formulate a test procedure, it was evident
that some kind of matching technique was indicated, in which a degree
of manipulation plays a part. Two methods came up for final
consideration: (1) the arrangement of finely graded series in proper order
was proposed since it called for fine discrimination; and (2) the matching
of individual color samples suggested itself as psychologically basic.


Fig. 1. Section of the I.C.I. diagram showing specifications of the Aptitude Test
material. The dotted line represents the yellowish-red series, the dot-dash line the
bluish-red series.


The projected test needed to be something more than an evaluation
of color discrimination in the sense of a measure of some retinal capacity.
“Color aptitude” involves the utilization of discriminative ability with
greater or less efficiency. From the beginning, we realized that the factor
of color-blindness must be taken into account, though it is only a minor
part of our problem. In order to deal, therefore, with color aptitude and
color blindness in the same test, we decided to work with colors that have
been shown to lie in the neutral zones for so-called “deuteranopia” and
“protanopia.” We chose for the initial tests two reds, a bluish red
(Munsell 6 RP/5 (λ 500c)) and a yellowish red (Munsell 5 R/5.4 (λ 612)).
We proposed to make a 40-step saturation series from each of these colors
to neutral gray (Munsell N/5). It was much easier to propose such
materials than to produce them. The Munsell Color Company had
found it practicable to make up their Book of Color with swatches varying
by two units each, while we were proposing to divide every one of
these units into 10 steps, i.e. 20 to their one. Mr. Foss and Mr. Granville,
who undertook the project for the committee, accomplished the job
with remarkable success.


Fig. 2. Per cent colorimetric purity of the two series. The solid line represents the
yellowish-red series, 1-40, the dotted line, the bluish-red series, 41-80.


Fig. 3. Per cent reflectance of the two series of chips. The solid line represents the
yellowish-red series, 1-40, the dotted line, the bluish-red series, 41-80.


Figure 1 shows the location of the color series on the I.C.I. chromaticity
diagram. From it can be seen the degree to which they parallel lines
of constant dominant wave length. The maximum deviation in the
yellow-red series is effectively in the region of 3 mμ, and in the purple-red
series 10 mμ. Figure 2 gives the change in per cent purity within the
series. Both series approximate straight lines, thus indicating that we
have nearly equal steps of colorimetric purity. Figure 3 is a similar
plot of the per cent reflectance of the chips in both series. That of the
bluish-red series shows constant reflectance plus or minus an extreme of
0.3 per cent. The yellowish-red series, which was chosen to lie in the
neutral zone for protanopes, was made to increase by 3 per cent in
reflectance from the neutral point to its saturated end to compensate for the
darkening of the spectrum at the red end for such subjects.


Procedure 


The first procedure to be tried consisted of presenting the subject with
the 80 chips in a haphazard arrangement on a gray background under
daylight illumination of 50 foot candles and requiring him to put the
series in correct order. Surprisingly, the problem proved much too easy.
A number of subjects made the arrangement without error on the first
trial. Therefore, while we could use it to screen out markedly deficient
subjects, it gave no differentiation among color competent subjects.
The failure of this procedure is psychologically interesting because it
points out that “discrimination” is not a single, simple concept.⁴ In
some psychophysical experiments which we carried out with the series of
chips, we found that for most O’s the J. N. D.⁵ lies between the first and
second chips away from any given standard. Apparently then, the serial
arrangement includes factors in addition to “discrimination” of each
chip from its two neighbors. A casual examination of the perceptual
problem indicates that an inversion in the series reveals itself to a subject
on the basis of at least 3 discrimination judgments, some of which are
between definitely supraliminal stimuli. Thus the inversion of a pair of
chips gives the perception of a “hump” in the series.

Our next procedure consisted of laying out one set of chips in a
predetermined haphazard order and having the subject match a second set
to the first. The task is somewhat more difficult than the first procedure.
Scores in terms of errors in matching spread out over a greater range, but
it remained possible for some subjects to make perfect sets of matches.
This was due in some measure to the gradual elimination of choices as
successive chips were filled in and to the fact that the subjects,
necessarily, were permitted to rearrange the matches until they obtained a
satisfactory total result.

The final procedure is to require a subject to find in the haphazardly
arranged field a match for a single chip at a time. When found, a match
is recorded and the chip is laid aside, so that the range of choices remains
constant throughout the test. In spite of the fact that it was not
possible to control several factors in the administration of the Preliminary
Sets, no subject made a perfect score on his first trial. With this
procedure we obtained 65 scores based upon 80 judgments. Analysis of
them gives us indications of several revisions that must be made. Figure
4 shows a frequency distribution of the 65 error scores. From it we
concluded no more than that the scores are adequately distributed for our
purposes and that the curve promises to follow a normal form.

4 K. Koffka, Perception: An introduction to the Gestalt-Theorie. Psychol.
Bull., 1922, 19, 540 ff.

5 E. B. Titchener, Experimental psychology. New York: Macmillan, 1927,
Vol. II, part 1, p. 59.

A major difficulty was encountered by many of the committee members
who tried out the test, in the time required to complete 80 matches.
The average time per match was 1.16 minutes, which gives an average
total time for 80 matches of 92.8 minutes. Obviously, an hour and a
half is too long for this sort of test. Many subjects could not be
persuaded to do the full 80 judgments in one sitting, nor could they return
easily for a second session. This is why three times as many cases were
obtained in which only one-half the test was completed. We reexamined the
data from the 65 cases for an indication of how we might interpret the
half-tests of only 40 judgments. We correlated the whole scores with part
scores with the results shown in Table 1:











                     Table 1
                   Correlations

                                     r      PE
80 judg. with 40 judg.
  (1 to 20) + (61 to 80)           .88    .016
80 judg. with 20 judg.
  (1 to 10) + (71 to 80)           .75    .036
10 judg. with 10 judg.
  (1 to 10) + (71 to 80)           .53    .06





The intercorrelations could have been made in many different ways, but 
we carried them no further since these gave sufficient indication that we 
could reduce the length of the test without materially altering the distri- 
bution of the results. 
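
The article does not say how the PE column of Table 1 was computed. A formula in common use at the time for the probable error of a product-moment correlation is PE = 0.6745(1 − r²)/√N, and the sketch below assumes that formula together with N = 65 cases; both are assumptions, and the function name is hypothetical.

```python
import math

N_CASES = 65  # number of complete 80-judgment records reported in the text


def probable_error_r(r, n):
    """Probable error of a correlation r from n cases (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Correlations printed in Table 1, keyed by the part-score they involve.
TABLE_1_R = {"40 judgments": 0.88, "20 judgments": 0.75, "10 judgments": 0.53}

computed_pe = {name: probable_error_r(r, N_CASES) for name, r in TABLE_1_R.items()}
```

On these assumptions the values for r = .75 and r = .53 fall close to the tabled .036 and .06, while the value for r = .88 comes out nearer .019 than the tabled .016.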

A fractionated distribution of errors as shown in Table 2 gives further
indication of the fundamental homogeneity of the test. Therefore, we
divided the “Preliminary Sets” into two sets each; one including chips
1-10, 21-30, 41-50, 61-70 and the other the remaining 40 chips. The
standard tests consist, then, of duplicate sets of 40 chips each, one
permanently mounted on a neutral gray background and the other unmounted.
The gray background forms the inside of a shallow wooden box 12 by 10
by 1 inches. The chips themselves are 1[fraction illegible] by
1[fraction illegible] inches. These dimensions allow us to fasten them in
4 rows of 5 chips in each of the two halves of the background with
[fraction illegible] inch between chips in the rows. Subjects must match a
loose chip to a fixed one by placing it below the latter.
This is accomplished both by instruction and by having a
[fraction illegible] in. strip of gray cardboard along the top edge of every
row. Other spatial factors are kept constant by using the same “haphazard”
arrangement of the fixed array. This “haphazard” array is not entirely
chance. The yellowish-red and bluish-red chips are alternated both
horizontally and vertically, and care is taken that near matching chips are
well scattered over the field.


Fig. 4. Distribution of 65 scores based on 80 matching judgments
with unlimited time.
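
The division into two half-sets can be illustrated with the error counts printed in Table 2. The sketch below is only a check on how evenly the errors fall between the two sets of 40 chips; the data structure and function names are hypothetical, and the counts are taken from the table as printed.

```python
# Errors per block of ten chips, as printed in Table 2.
ERRORS_BY_BLOCK = {
    (1, 10): 447, (11, 20): 424, (21, 30): 372, (31, 40): 336,
    (41, 50): 322, (51, 60): 337, (61, 70): 477, (71, 80): 457,
}

# Blocks assigned to the first half-set according to the text.
SET_ONE_BLOCKS = [(1, 10), (21, 30), (41, 50), (61, 70)]


def chips_in(blocks):
    """List the chip numbers covered by a list of (first, last) blocks."""
    return [chip for first, last in blocks for chip in range(first, last + 1)]


def split_errors(errors, set_one):
    """Return total errors in the first half-set and in the remaining blocks."""
    one = sum(errors[block] for block in set_one)
    other = sum(count for block, count in errors.items() if block not in set_one)
    return one, other


errors_set_one, errors_set_two = split_errors(ERRORS_BY_BLOCK, SET_ONE_BLOCKS)
```

With the printed counts the two half-sets carry 1,618 and 1,554 errors respectively, a difference of about four per cent.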

The time factor presented several further problems which required
other modifications of procedure. While the reduction of the total number
of judgments from 80 to 40 brought the average time for the test
within acceptable limits, individual times scattered widely about that
average. Speed of matching may be as important a factor in color aptitude
as accuracy. We sought to take speed into account by weighting
scores with a factor obtained from the ratio of the individual time to the
average time.

    Weighted score = per cent correct + (Av. time − ind. time) / Av. time

Thus the weighted scores became a measure of accuracy in matching
modified by the rate of matching.


                              Table 2
                Fractionated Distribution of Errors

                   Yellowish-red                  Bluish-red
Chip nos.   1-10  11-20  21-30  31-40     41-50  51-60  61-70  71-80
Errors       447    424    372    336       322    337    477    457
Totals              1579                          1587


Fig. 5. Distribution of 338 scores using the final procedure with 30 min. time limit.

This device, however, did not bring the actual time for administering 
the test within desired limits nor stabilize it at a specific time and was 
given up in favor of a fixed time of 30 minutes, which is sufficiently below 
the unrestricted average time (45 min.) so that the majority of subjects 
may be expected not to complete the test. Fixing a time limit introduces 
the possibility of an effect by the order in which the chips are presented 
for matching. Some matches may be easier to make than others. When 
all 40 chips were presented, this factor could be neglected, but now the 
particular chips matched within the time limit may be an important 
determinant of the score. Therefore, the loose chips are presented in a 
specified order so that all subjects match the same chips up to any point 
in the course of the test. 

Complete instructions for giving the test are as follows: 


“The set of 40 mounted chips which constitutes the Matching Field should 
be laid out flat in diffuse illumination of daylight quality (equivalent to I.C.I. 
‘Illuminant C’). The illumination should be in the neighborhood of 50 foot 
candles and as uniform as possible over the whole area of the ‘matching field.’ 
Care must be taken that the illuminant does not shine into the subject’s eyes 
and that no shadows are cast upon the ‘matching field.’ Images mirrored 
by the glossy surfaces of the chips can be eliminated by proper placing of them 
with respect to the illuminant and the subject. 

“Procedure: Give the subject the first one of the unmounted chips to be 
matched and instruct him as follows: 

“‘Among the colored chips laid out before you, there is a mounted one 
which matches exactly each one of the unmounted set. Find the mounted 
chip which matches each unmounted one. Do not hurry, but work right 
along, because your score will be determined partly by your speed in matching. 
You will be allowed to make as many matches as you can in 30 minutes. Your 
score will be best if you use all the time allowed.’ 

“When the subject has found the match, enter the code letters of the chip 
in the corresponding place on the Scoring Chart, put the first chip back in the 
box and give the subject the second chip. Note that spaces are left on the 
Scoring Chart for recording several unmounted matches to every mounted chip, 
but the code symbol of an unmounted chip may be recorded only once. There 
will be some blank spaces on the completed Scoring Chart. 

“The subject may move a ‘matching chip’ about in order to compare it 
with any chip in the ‘matching field.’ (1) He must place it below the field 
chip with which he is comparing it, and in contact with it. (2) All matches 
must be made with both chips flat on the surface of the background. (3) The 
subject must view the chips from a distance of not less than 10 inches. (4) The 
subject matches only one chip at a time.” 

Fig. 6. Percentile distribution of 338 scores. (Axes: percentile against score.) 


We now have 338 test scores obtained with the final test procedure. 
The subjects include expert colorists, textile students, industrial workers, 
office workers and college students. Figure 5 shows the distribution of 
scores and Figure 6 their percentile distribution. The curves approximate 
the normal curves nearly enough to indicate the general character 
of our sample population. An attempt was made to obtain other information 
concerning the color proficiency of subjects, with which to 
correlate test scores. A rating scale of nine steps from “Exceptional” 
to “Poor” for “Expertness in Color Matching” was printed on every score 
sheet, but it was rarely completed because the information was lacking 
or not readily available to the person who was giving the test. It appears 
that these highly desirable ratings cannot be obtained on a general 
basis. This, indeed, is the reason for the test. It should be possible, 
however, to obtain them within more limited groups, such as a particular 
industry or a single laboratory or company. Even then a correlation 
will measure the judgment technique and the validity of individual 
judges. 

Fig. 7. Distribution of the accuracy factor, score per match, obtained by dividing the 
score made in 30 min. by the number of matches; 338 scores. 

Fig. 8. Percentile distribution of the accuracy factor. 

Fig. 9. Distribution of the speed factor, score per minute, obtained by dividing 
the score made in 30 min. by 30 min. (or by the actual time when the test was completed 
in less than 30 min.). N = 338 cases. 

Fig. 10. Percentile distribution of the speed factor. 

In the discussion of the procedure, we pointed out the desirability of 
combining speed and accuracy in a single score. For particular problems, 
it is interesting to know the contribution made by each of these factors. 
In order to separate them, we calculated the score per match and the 
score per minute for every subject. The distribution and percentile 
curves for the two factors are shown in Figures 7-10. The correlation between 
them is positive, though not high, viz. r = +.35, PE of r = .045, 
indicating that the two factors are only slightly related and therefore one cannot 
be substituted for the other. 
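
As a minimal sketch of how the two factors described above could be separated, the Python below divides a score by the number of matches made (accuracy) and by the time used (speed), and then correlates the two across subjects. The function names and the sample records are hypothetical illustrations and are not data from the study.

```python
from math import sqrt

TIME_LIMIT_MIN = 30.0  # fixed time limit of the final procedure


def score_per_match(score, matches):
    """Accuracy factor: score divided by the number of matches made."""
    return score / matches


def score_per_minute(score, minutes_used):
    """Speed factor: score divided by 30 min., or by the actual time if shorter."""
    return score / min(minutes_used, TIME_LIMIT_MIN)


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical records: (score, matches made, minutes used).
records = [(62, 31, 30.0), (48, 20, 30.0), (75, 40, 26.0), (55, 33, 30.0), (40, 25, 30.0)]
accuracy = [score_per_match(s, m) for s, m, _ in records]
speed = [score_per_minute(s, t) for s, _, t in records]
r = pearson_r(accuracy, speed)
print(f"r between accuracy and speed factors: {r:+.2f}")
```

A low positive r on real data would support the conclusion in the text that neither factor can stand in for the other.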


Summary 


On the basis of the foregoing results we have found that matching 
judgments made within saturation series of finely graded steps give a 
well distributed set of individual scores upon which to establish ‘‘color 
matching aptitude” ratings. While as yet these ratings are related only 
hypothetically to practical performance on the job, there is agreement 
among those who have used the test that the results are significant. In 
the few cases where supervisors would give a definite rating to the color 
competence of a subject, the correlation was good. Within one company 
which tested a large number of its employees, chemists and dye workers 
rated higher than laboratory technicians and office workers. 

The materials used in the first experimental test have been expended 
in this preliminary work; therefore the committee has produced new 
material for a “1944 Experimental Edition” of the test which is now 
available in a limited quantity. Since the major criticisms of the first 
form of the test were its limited range of hues and its relatively low saturation, 
the new edition corrects these faults. No data have yet been obtained 
from its use, but the sets are now in the hands of the committee 
and results will be published as soon as they become available. 


Received January 3, 1945. 














Statistical Analysis of an Industrial Rating Chart * 


D. J. Bolanovich 
Radio Corporation of America 


Wherever extensive and continued use is made of a rating device, its 
value can be enhanced by thorough statistical analyses. This fact is, of 
course, very obvious and well-known to all who have had even slight 
training in the use of ratings. However, in the industrial field, it 
is not unusual to find rating devices in use which are accepted without 
question as adequate measures of everything they specify. It is not un- 
usual to find preassigned scoring methods whose suitability has never 
been tested. Too seldom do those applying the ratings attempt to dis- 
cover what the scales really measure, and to recognize overlapping, bias, 
and those changes in intended scale values which are imposed unknow- 
ingly by raters themselves. When men are selected for promotion by 
such rating reports, superficial interpretations may do an injustice to 
employees and employers. At best the ratings are not as efficient as 
they might be. 

This report describes the experience of one company, which uses a 
Personnel Rating Chart, and which endeavors to obtain maximum effect- 
iveness of its interpretation through statistical analysis. The Personnel 
Rating Charts were developed to meet the company’s need for records 
upon which to base promotion of field personnel to key managerial and 
technical positions. The subjects of ratings used in this study were 143 
field engineers who service electronic equipment throughout the United 
States. Raters were 11 district managers supervising the engineers. It 
is the company’s policy to rate all personnel other than higher manage- 
ment every six months. The following analysis was made of the first 
semi-annual ratings. 

* This rating experiment was based on the experience of the R.C.A. Service Company, 
a subsidiary of the R.C.A. Victor Division of the Radio Corporation of America. 
Acknowledgment is made to Mr. A. Goodman, assistant general manager of the company, 
who made the study possible, and Mrs. E. Fish, who tabulated the data. 

The confidential nature of information on the charts prevents their 
illustration here. However, a brief description of the charts may make 
this discussion more understandable. Items composing the charts were: 
(1) Personality, (2) Personal Appearance, (3) Punctuality, (4) Thoroughness, 
(5) Efficiency, (6) Resourcefulness, (7) Dependability, (8) Cooperation, 
(9) Job Attitude, (10) Technical Ability, (11) Sales Ability, (12) 
Organizing Ability, (13) Judgment, and (14) Desire for Self-Improvement. 
These items were deemed most important by company management 
for promotion to supervisory and specialist positions. Each item 
was carefully defined in terms of observable behavior. Each item was 
followed by five boxes numbered from 1 to 5, and representing continuous 
intervals on a scale. The meanings and limits of the intervals were indicated 
at the top of the chart. An attempt to curb bias was made by 
changing the order of the numbered boxes from item to item. An 
example of two such items would be: 





1. Personality: Cheerfulness and pleasantness 
in relations with others. Extent 
of friendships with associates and customers. 
Ability to hold confidence and 
admiration. 

6. Resourcefulness: Success in handling 
routine and special problems without 
continual help. Activity in developing 
new applications, finding new needs 
for equipment. Suggestions for equipment, 
methods, procedures, and making 
new products. 
Where the key to the scales reads: 5 = Excellent (among the best 
10%); 4 = Superior (among the best 1/3, but not in best 10%); 3 = Good 
(in the middle 1/3); 2 = Fair (in the lower 1/3, but not in lowest 10%); 
and 1 = Poor (among low 10%). 

An additional item asked, “Would you recommend this employee for 
a more responsible assignment?” and requested specific information. 
Other information was asked for on the reverse side of the sheet. Each 
supervisor making ratings was given an explanatory chart for reference 
which discusses the purposes and uses of the personnel ratings, and contains 
helps for effective rating. 

After completion of one set of ratings, duplicate copies of the charts 
were forwarded to the home office, where all ratings were punched on 
IBM cards for analysis. Distributions of ratings for each item, and 
intercorrelations between each pair of items were calculated on IBM 
sorters. Correlations were also determined between “recommendation 
for more responsible assignment” and each item. These latter correla- 
tions were biserial. Inter-item correlations were Pearsonian. The data 
thus obtained were processed as follows: 

(a) A factor analysis was made to determine the extent to which 
common factors accounted for the variance in ratings. 

(b) Using “Yes” and “No” responses to the question, “Would you 
recommend this man for a more responsible assignment?” as a 
criterion, a multiple correlation was obtained between the best 
combination of items and the criterion. 

(c) Methods of scoring the charts were experimented with to find a 
statistically sound scoring procedure. 


Factor Analysis 


Factor analysis of the items was conducted according to Thurstone’s 
Centroid Method.¹ The extraction of factors was continued until 
McNemar’s criterion² was satisfied. McNemar proposes that when the 
standard deviation of the partial residuals reaches or falls below 1/√N, the 
magnitude of residuals may be considered as due to chance sampling 
errors in the original intercorrelations. The present extraction of factors 
was at first halted after five were obtained, at which time the S.D. of 
partial residuals was only .039 greater than 1/√N. However, when rotation 
failed to reduce the item loadings on some of the factors satisfactorily, 
a sixth factor was extracted. Inclusion of the sixth factor improved 
the effectiveness of rotation in reducing factor loadings to zero and eliminating 
negative loadings. 

Tables 1 and 2 show factor loadings, communalities, and uniqueness 
of items before and after 15 completed rotations. It appeared that any 
further rotations would not be effective in clarifying the factors further. 
Rotations were made using two axes at a time and computing new loadings 
after each rotation.³ All six factors had some item loadings greater 
than .40. The factor F1 contained 7 items whose loadings were not 
significantly greater than zero; F2 contained 3 such; F3 contained 9; F4 
contained 3; F5 contained 7; and F6 contained 4. Except for the factor 
F4, not many items were heavily loaded with any given factor. This 
made interpretation of the factor meanings not too difficult a job. 
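
The relation between the loadings and the last columns of the tables can be checked directly: an item's communality is the sum of its squared loadings, and its uniqueness is one minus that sum. The sketch below is only an illustration. It uses the rotated loadings of four items as read from Table 2, assuming a leading decimal point wherever the print omits one, and the function names are hypothetical.

```python
# Rotated loadings on F1..F6 for four items, as read from Table 2
# (leading decimal points are assumed where the print omits them).
rotated_loadings = {
    "Resourcefulness": [0.360, 0.664, 0.083, 0.206, 0.183, 0.237],
    "Dependability": [0.318, 0.380, -0.026, 0.636, 0.061, 0.387],
    "Technical Ability": [0.016, 0.812, -0.002, 0.287, 0.078, 0.179],
    "Desire for Self-Improvement": [0.018, 0.276, 0.011, 0.454, 0.003, 0.455],
}


def communality(loadings):
    """h squared: the sum of squared loadings on the common factors."""
    return sum(a * a for a in loadings)


def uniqueness(loadings):
    """u squared: the part of the variance not shared with the common factors."""
    return 1.0 - communality(loadings)


for item, loadings in rotated_loadings.items():
    print(f"{item:28s} h2 = {communality(loadings):.2f}  u2 = {uniqueness(loadings):.2f}")
```

For these four items the computed values agree to two decimals with the h² and u² columns printed in Table 2.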

In attempting names for the factors, the definitions given for the items 
were carefully considered. Factor F1, for example, has heaviest loadings 
in Personal Appearance and Thoroughness, which seem difficult to reconcile. 
However, on the chart Personal Appearance is defined as “Carefulness 
in dress and posture.” It seemed that F1, then, was a “Meticulousness” 
or “Attendance to Detail” factor. F2, which is highly loaded in 
Technical Ability, Resourcefulness, and Efficiency, would seem to be 
similar to the factor reported by Ewart, Seashore, and Tiffin⁴ as “Ability 
to do present job.” F3, with highest loadings in Sales Ability and Attitude 
Toward Job, was termed a “Sales Ability” factor. Selling is one 
important aspect of the work of field engineers. F4 is difficult to name 
since many items measure it. It would seem that a factor of “Job Conscientiousness” 
runs through those items with large loadings. F5 was 
termed an “organizing” or “systematic” factor since it was found largely 
in the items Judgment and Organizing Ability. Judgment is defined on 
the charts as “ability to make good decisions based on facts.” F6 seems 
to be a “Social Intelligence” factor. Its heaviest loadings are in Personality, 
Cooperation, Judgment, and Desire for Self-Improvement. 

These names for factors, of course, are subjective “best guesses.” 
It might be interesting to note a simple check made on the appropriateness 
of factor names. A local university psychology instructor wrote 
down the six suggested factor names. Then without knowledge of factor 
loadings, he wrote his opinion as to whether an item would receive a 
high, low, or intermediate loading in each factor. His guesses seemed to 
approximate actual figures rather closely. He experienced most difficulty 
with factor F, which was then termed “Executive Ability.” 

¹ Guilford, J. P., Psychometric methods. New York: McGraw-Hill Book Co., 1936, 
pp. 478-508. 

² McNemar, Q., On the number of factors. Psychometrika, 1942, 7, 9-18. 

³ Guilford, J. P., op. cit., p. 502. 

⁴ Ewart, E., Seashore, S. E., and Tiffin, J., A factor analysis of an industrial merit 
rating scale. J. appl. Psychol., 1941, 25, 481-486. 

Table 1 
Factor Loadings and Communalities for the Fourteen Ratings 
Before Rotation of Axes 

Item                               F1     F2     F3     F4     F5     F6    h² 
 1. Personality                   .729  -.377  -.144   .213   .210   .155   .81 
 2. Personal Appearance           .460  -.437  -.438   .034   .167  -.172   .65 
 3. Punctuality                   .654   .078  -.186   .226  -.386  -.021   .67 
 4. Thoroughness                  .733   .259  -.306  -.186  -.152  -.044   .76 
 5. Efficiency                    .808   .207  -.102  -.184   .008  -.158   .76 
 6. Resourcefulness               .730   .130  -.114  -.201   .301   .124   .70 
 7. Dependability                 .826   .247  -.088   .136  -.129   .135   .80 
 8. Cooperation                   .797  -.197   .170   .161   .059   .112   .74 
 9. Attitude Toward His Job       .767  -.232   .218  -.069  -.148   .124   .73 
10. Technical Ability             .668   .318   .222  -.279   .244   .212   .78 
11. Sales Ability                 .559  -.274   .162  -.260  -.239  -.163   .56 
12. Organizing Ability            .762   .133   .128   .084   .000  -.201   .66 
13. Judgment                      .766   .040   .255   .167   .189  -.178   .75 
14. Desire for Self-Improvement   .596   .125   .161   .220  -.076   .199   .49 

Table 2 
Factor Loadings, Communalities and Uniqueness After Fifteen 
Completed Rotations 

Item                               F1     F2     F3     F4     F5     F6    h²    u² 
 1. Personality                   .380   .226   .370  =.137  =.230   .634   .80   .20 
 2. Personal Appearance           .572  -.016   .339   .077   .378   .261   .66   .34 
 3. Punctuality                   .334  -.001   .058   .683   .086   .275   .66   .34 
 4. Thoroughness                  .498   .389   .020   .568   .147   .005   .74   .26 
 5. Efficiency                    .341   .501   .030   .508   .355   .088   .76   .24 
 6. Resourcefulness               .360   .664   .083   .206   .183   .237   .71   .29 
 7. Dependability                 .318   .380  -.026   .636   .061   .387   .80   .20 
 8. Cooperation                   .086   .331   .297   .377   .253   .575   .74   .26 
 9. Attitude Toward His Job       .016   .343   .461   .491   .190   .349   .73   .27 
10. Technical Ability             .016   .812  -.002   .287   .078   .179   .78   .22 
11. Sales Ability                -.004   .190   .476   .427   3243  -.010   .56   .44 
12. Organizing Ability            .107   .339  -.017   .513   .419   .305   .66   .34 
13. Judgment                      .006   .389   .001   .364   .493   .473   .75   .25 
14. Desire for Self-Improvement   .018   .276   .011   .454   .003   .455   .49   .51 





In addition to these common factors, the items Desire for Self-Improvement 
and Sales Ability seem to have high uniqueness values and 
may represent specific characteristics. In view of the fact that reliabilities 
cannot be determined for these ratings, specificity is not obtainable. 
However, an attempt was made to estimate roughly the specificities in 
the following manner: According to Thurstone’s formulae,⁵ the reliability 
of an item will be at least as great as its calculated communality. Thus, 
the reliability of the item Personality would be at least .80. From the 
formula for specificity, S² = r − h², we would get a specificity of zero for 
Personality. Then, it was assumed that since other items are equally or 
more objective than Personality, they would have reliabilities of at least 
.80 also. By substituting .80 for reliability in the above formula for 
each item, specificities were estimated. This method is only roughly 
approximate, but gives some idea of the magnitude of specificity values. 
The items with largest estimated specificities are given in Table 3. It 
appears that Desire for Self-Improvement and perhaps Sales Ability do 
measure independent characteristics. 
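
The arithmetic of this rough estimate is easy to reproduce. The following sketch, offered only as an illustration, subtracts each rotated communality from the assumed reliability of .80; the communalities are those printed in Table 2, and the names in the code are hypothetical.

```python
ASSUMED_RELIABILITY = 0.80  # reliability assumed for every item, as in the text

# Communalities after rotation, as printed in Table 2.
communalities = {
    "Personality": 0.80,
    "Desire for Self-Improvement": 0.49,
    "Sales Ability": 0.56,
    "Organizing Ability": 0.66,
    "Punctuality": 0.66,
    "Personal Appearance": 0.66,
    "Resourcefulness": 0.71,
}


def estimated_specificity(h2, reliability=ASSUMED_RELIABILITY):
    """Specificity as reliability minus communality."""
    return reliability - h2


for item, h2 in communalities.items():
    print(f"{item:28s} {estimated_specificity(h2):.2f}")
```

Run on these figures, the sketch gives zero for Personality and, for the other six items, the values listed in Table 3.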

It might be well to point out here some comparisons between this 
study and that done by Ewart, Seashore, and Tiffin, since the latter has 
influenced greatly the thinking of those who use rating scales. While the 
present study found six common factors, that of Ewart, Seashore, and 
Tiffin found three (one of which was discarded as unreliable). The 
traits rated differed somewhat, but were similar in many respects. The 
chart used by the three authors measured 12 traits. Perhaps the most 
reasonable explanation for the difference in results of these two studies 
lies in the nature of the work being rated. The earlier study dealt with 
direct production workers, while this study deals with field engineers 
whose work requires a wide range of abilities and characteristics. For 
example, field engineers sell, they contact people, they make reports, 
meet many new problems and require highly specialized training. With 
factory workers, supervisors are largely influenced by quantity and 
quality of work done, and perhaps by attitudes of employees. There are 
some further reasons that may account for the greater number of factors 
in the charts for field engineers: (1) the district managers are highly 
interested in developing the all-round potentialities of their employees; 
(2) the managers had been accustomed to reporting on various traits of 
field men; (3) efforts were made to break up possible bias or halo effects, 
as can be seen in the above discussion of the nature of the chart used. 

⁵ Guilford, J. P., op. cit., p. 477. 

Table 3 
Estimated Specificities* of Items Having Smallest Communalities 

Item                           Estimated Specificity 
Desire for Self-Improvement            .31 
Sales Ability                          .24 
Organizing Ability                     .14 
Punctuality                            .14 
Personal Appearance                    .14 
Resourcefulness                        .09 

* Estimated on the basis of reliabilities assumed equal to the communality for 
the Personality item. 


Relationship of Items to Overall Performance 


Table 4 shows the correlations between individual item ratings and 
responses to the question, ‘‘Would you recommend this man for a more 
responsible assignment?” It is assumed here that “Yes” and “No” 
represent a dichotomous division of a continuum of degree of recommend- 
ation. Furthermore, to be a practical criterion, recommendations must 
represent present job success and not peculiar fitness for some specific 
job. Since it is a general practice for supervisors to recommend for 
promotion those workers performing best on their present jobs, this 
criterion should be sufficiently valid. 
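
For readers who wish to see what the assumption of a dichotomized continuum implies computationally, the sketch below applies the standard biserial formula: the difference between the mean ratings of the “Yes” and “No” groups, divided by the standard deviation of all ratings and multiplied by pq/y, where y is the ordinate of the normal curve at the point of division. The ratings shown are hypothetical, and the function name is an assumption made for this illustration.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(ratings, recommended):
    """Biserial correlation between ratings and a Yes/No recommendation."""
    yes = [x for x, flag in zip(ratings, recommended) if flag]
    no = [x for x, flag in zip(ratings, recommended) if not flag]
    if not yes or not no:
        raise ValueError("both Yes and No groups are needed")
    p = len(yes) / len(ratings)  # proportion recommended
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point dividing p from q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (fmean(yes) - fmean(no)) / pstdev(ratings) * (p * q / y)


# Hypothetical ratings on one item for eight engineers.
ratings = [5, 4, 3, 3, 4, 3, 2, 2]
recommended = [True, True, True, True, False, False, False, False]
print(f"biserial r = {biserial_r(ratings, recommended):.2f}")
```

With these invented figures the coefficient is about .65. Unlike a product-moment coefficient, a biserial r can exceed 1.00 when the rated variable departs strongly from normality.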

Table 4 
Biserial Correlations of Item Ratings With “Yes” or “No” Responses to “Would You 
Recommend This Employee for a More Responsible Position?” 

Item                              r bis 
 1. Personality                    .59 
 2. Personal Appearance            .43 
 3. Punctuality                    .33 
 4. Thoroughness                   .59 
 5. Efficiency                     .59 
 6. Resourcefulness                .58 
 7. Dependability                  .53 
 8. Cooperation                    .39 
 9. Attitude Toward His Job        .59 
10. Technical Ability              .46 
11. Sales Ability                  .49 
12. Organizing Ability             .59 
13. Judgment                       .48 
14. Desire for Self-Improvement    .24 

Table 5 
Matrix of Item Intercorrelations* 

Item                               1    2    3    4    5    6    7    8    9   10   11   12   13   14 
 1. Personality                  .70 
 2. Personal Appearance          .60  .60 
 3. Punctuality                  .42  .29  .68 
 4. Thoroughness                 .46  .27  .58  .73 
 5. Efficiency                   .48  .41  .48  .73  .73 
 6. Resourcefulness              .49  .38  .36  .58  .64  .67 
 7. Dependability                .53  .27  .68  .70  .63  .63  .70 
 8. Cooperation                  .70  .36  .49  .46  .53  .56  .62  .70 
 9. Attitude                     .61  .33  .48  .46  .54  .48  .62  .67  .67 
10. Technical Ability            .39  .05  .26  .51  .62  .67  .57  .48  .51  .67 
11. Sales Ability                .33  .30  .37  .38  .43  .33  .38  .48  .58  .31  .58 
12. Organizing Ability           .49  .26  .49  .59  .60  .51  .67  .57  .56  .52  .44  .68 
13. Judgment                     .54  .29  .44  .43  .63  .52  .61  .69  .59  .58  .38  .68  .69 
14. Desire for Self-Improvement  .44  .12  .43  .34  .51  .37  .53  .54  .46  .44  .22  .45  .48  .54 

* Item self-correlations (Reliabilities) are given as equal to the highest intercorrelation 
in the row and column in which the item appears. 

The highest single correlation (biserial) with the criterion was .59, 
which was shown for each of the five items: Personality, Efficiency, 
Thoroughness, Job Attitude, and Organizing Ability. The lowest was 
.24 between the criterion and Desire for Self-Improvement. A maximum 
multiple correlation of .81 was obtained between the criterion and a 
combination of the following 7 items: Personality, Efficiency, Resourcefulness, 
Cooperation, Job Attitude, Sales Ability, and Organizing Ability. 
Here the Wherry-Doolittle technique⁶ was used to select items and was 
continued until an additional item would have added only .007 to the 
multiple correlation coefficient. The regression equation for estimating 
the degree of recommendation for a more responsible assignment is: 

X = 3.8 (Personality) + 1.0 (Efficiency) + 3.8 (Resourcefulness) 
    − 4.8 (Cooperation) + 2.1 (Job Attitude) + 2.3 (Sales Ability) 
    + 2.5 (Organizing Ability). 


Method of Scoring 


Several possibilities were considered for scoring the charts. First, 
they could be scored by simply adding the ratings for each scale. Second, 
a total score could be obtained by weighting each item in proportion to 
its regression equation coefficient. Third, weights might be assigned 
items which would yield scores for each of the factors found by factor 
analysis.⁷ It was decided that factor scores would not be of much 
practical value. 

In order to test the relative effectiveness of scoring by straight item 
summing and by using regression weights, charts for 94 field engineers 
were scored both ways. Biserial correlations were then calculated be- 
tween total scores and recommendations for more responsible assign- 
ments. As might be expected from the multiple correlation coefficient, 
total scores by the regression equation method correlated .83 with recom- 
mendations. Total scores obtained by adding all 14 item ratings cor- 
related .75 with recommendations. 

As a result of these findings, it was decided to score the charts by 
assigning regression weights. 
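
The two scoring procedures compared above can be sketched in a few lines. The weights are the coefficients of the regression equation given earlier; the function names and the sample chart are hypothetical, and the code is meant only as an illustration of the computation, not as the company's actual procedure.

```python
ITEMS = [
    "Personality", "Personal Appearance", "Punctuality", "Thoroughness",
    "Efficiency", "Resourcefulness", "Dependability", "Cooperation",
    "Job Attitude", "Technical Ability", "Sales Ability",
    "Organizing Ability", "Judgment", "Desire for Self-Improvement",
]

# Coefficients of the seven-item regression equation reported in the text.
REGRESSION_WEIGHTS = {
    "Personality": 3.8,
    "Efficiency": 1.0,
    "Resourcefulness": 3.8,
    "Cooperation": -4.8,
    "Job Attitude": 2.1,
    "Sales Ability": 2.3,
    "Organizing Ability": 2.5,
}


def simple_sum_score(chart):
    """First method: add the ratings on all fourteen items."""
    return sum(chart[item] for item in ITEMS)


def weighted_score(chart):
    """Second method: sum of rating times regression weight over seven items."""
    return sum(weight * chart[item] for item, weight in REGRESSION_WEIGHTS.items())


# Hypothetical chart: an engineer rated 3 ("Good") on every item.
chart = {item: 3 for item in ITEMS}
print(simple_sum_score(chart), round(weighted_score(chart), 1))
```

For the all-threes chart the simple sum is 42, and the weighted score is 3 times the sum of the weights (10.7), or 32.1. The negative weight on Cooperation means that, other ratings being equal, a higher Cooperation rating lowers the weighted score.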


Summary and Conclusions 


1. Fourteen-item rating charts for 143 field engineers, rated by 11 
district managers, were analyzed to determine: (a) the common and 
unique factors operating, (b) the multiple correlation between rating 
items and overall job success, and (c) possible methods resulting from 
these analyses for scoring and interpreting the ratings. 

2. Factor analysis showed six common factors measured by the 
scales. These were named: Attendance to Detail, Ability to do the 
Present Job, Sales Ability, Conscientiousness, Organizing or Systematic 
tendency, and Social Intelligence. 

3. The items Sales Ability and Desire for Self-Improvement had 
large uniqueness values and are probably measuring significant specific 
factors. 

4. When a multiple regression equation was computed for predicting 
recommendations for promotion, 7 of the 14 items were included as 
adding significantly to the multiple correlation. The multiple R was 
.81, and included the items: Personality, Efficiency, Resourcefulness, 
Cooperation, Job Attitude, Sales Ability, and Organizing Ability. 

5. Total scores obtained by adding the 14 item ratings yielded a correlation 
of .75 with recommendations for more responsible assignments. 
Total scores obtained by summing the products of item scores and their 
regression weights for 7 items yielded a correlation of .83 with recommendations. 
The latter method was selected for assigning overall scores. 

6. These findings and their application will aid the company in its 
use of the scales in the following ways: 

(a) An objective method is made available for immediate determination 
of relative all-round performance of employees. 

(b) Relationships of various items to job success give some idea of their 
relative importance. 

(c) In considering individuals, factor analysis findings give a better idea 
of the meanings of item scores. 

⁶ The Wherry-Doolittle test selection method (courtesy of R. J. Wherry). Work-guide 
used by U. S. Employment Service Division, Occupational Analysis Section. 

⁷ Thomson, G. H., The factorial analysis of human ability. New York: Houghton 
Mifflin Co., 1939, pp. 107-110. 


Received December 6, 1944. 








A Simplified Form for Reporting Test Results * 
Brent Baxter 
Owens-Corning Fiberglas Corporation, Toledo, Ohio 


and 


Evelyn Potechin 
Ohio State University 


The success of a personnel testing program in an industrial or govern- 
ment agency depends to a large extent on the efficiency with which test 
information is conveyed to the operating officials such as supervisors, 
employment interviewers, counselors, and training instructors. These 
individuals are usually untrained in psychological testing and are unable 
to understand readily any technical aspects of a testing program or to 
understand the significance of test scores as applied to personnel prob- 
lems. If they are to derive any benefit from the use of test results, the 
test technician must present the results to them in an easily understood 
manner. This paper describes a “Test Report Form” that has been 
found effective in expressing a test score or a set of test scores for an 
individual to non-psychologically trained operating officials. 

Several devices are in current use for translating the raw scores of 
tests to something more meaningful. Among these are the standard 
score, the percentile, descriptive phrases (e.g., excellent, good, average, 
poor, very poor), and the profile. The Test Report Form (Fig. 1) is a 
combination of all these devices. It is essentially a graphic rating scale 
on which the check marks are positioned according to percentiles or 
standard scores based on objective test scores. 

Spaces are provided at the top of the Form for the examinee’s name 
and other identifying data. Beneath this is a brief explanation to the 
operating official to guide him in understanding the Form. In the left-hand 
column are listed by name all tests which are in use in the program. 
Brief phrases further describing what the test measures may be placed 
beneath the name of each test. The list of tests shown in Figure 1 was 
adjusted to include all those which were in the regular available series 
for clerical workers in a particular government agency and was designed 
for use with other materials which explained the purpose of each test. 
Since in most cases an individual does not take all tests in the list, the 
names of the tests administered are encircled in red for ready identification. 

* The authors wish to express their appreciation to Dr. Charles C. Gibbons whose 
suggestion prompted the development of the Form, and to Mr. Roger T. Lennon who 
offered several useful criticisms during the construction of the Form and in reading the 
manuscript. 

Opposite the name of each test is a graphic scale, along which five 
phrases descriptive of various levels of performance are placed in order. 
The test results are recorded on these scales in terms of percentiles, using 
the small dots on the lines as guides to indicate the deciles. A testing 
clerk may record the results by noting the individual’s percentile and 
placing a red check mark on the line at the point which conforms to the 
proper percentile. The supervisor interprets the test results directly in 
terms of the descriptive phrases and does not have to have the concept 
of percentiles explained to him. For those who wish to know the percentile 
score, it may be read fairly accurately from the position of the red 
check mark. The same procedure can be adapted to the use of standard 
scores. 

On the present Form the extreme phrases each refer to about seven 
per cent of the norm group, the second and fourth phrases each pertain to 
23 per cent, and the center phrase concerns the middle forty per cent. 
The decile dots have been spaced along the lines so that equal spaces 
along each line refer to equal increments in ability. The total line length 
represents five standard deviations (plus and minus two and one-half 
standard deviations).¹ 

No attempt is made to join the red check marks representing performance
on the various tests as on a profile chart or psychograph. The relative
standing on the various tests is clear without a joining line, which
would only be confusing in this situation; a line joining check marks for
non-adjacent test scales will cross the scale for another test which may
not have been administered and suggest that scores have been obtained
for the intermediate test. It is important, moreover, that if an individual’s
standings on the various tests are to be compared, the groups on
which the tests’ percentiles are based should be comparable.


¹ The following method may be used to obtain this system of spacing: (1) determine
the length of line most convenient for the size of paper to be used (6 inches is suitable
for paper 8 inches wide); (2) determine the length represented by a standard deviation
by dividing the line length by 5; (3) place the 5th decile dot (mean) in the center of the
line; (4) place the 6th, 7th, 8th, and 9th decile dots to the right of the mean by .25, .52, .84,
and 1.28 of the standard deviation length (obtained in 2) respectively; (5) place the
4th, 3rd, 2nd, and 1st decile dots to the left of the mean by .25, .52, .84, and 1.28 of
the standard deviation length respectively. In arranging the descriptive phrases, the
extreme phrases at the right and left are placed beyond plus and minus 1.47 of the
standard deviation length respectively. The second phrase is placed between the end
phrase and the 3rd decile dot; the middle phrase is between the 3rd and 7th decile
dots; and the fourth phrase is between the 7th decile dot and the extreme phrase
(+1.47 sigma). The spaces allotted to the phrases are only approximately equal.
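
The spacing method given in footnote 1 can be checked with a short calculation. The following is only a sketch: the function and constant names are illustrative and not part of the Form, and a normal distribution of ability in the norm group is assumed. It places the decile dots on a 6-inch line and computes the share of the norm group falling under each of the five phrases when the cut points are taken at 1.47 and .52 of the standard deviation length.

```python
from statistics import NormalDist

LINE_LENGTH_IN = 6.0                 # line length suggested in the footnote
SD_LENGTH_IN = LINE_LENGTH_IN / 5    # the line spans five standard deviations

# Offsets of the decile dots from the mean, in standard deviation units,
# as listed in the footnote.
DECILE_Z = {1: -1.28, 2: -0.84, 3: -0.52, 4: -0.25, 5: 0.0,
            6: 0.25, 7: 0.52, 8: 0.84, 9: 1.28}


def dot_position(decile: int) -> float:
    """Distance in inches from the left end of the line to a decile dot."""
    return LINE_LENGTH_IN / 2 + DECILE_Z[decile] * SD_LENGTH_IN


def band_shares(outer: float = 1.47, inner: float = 0.52) -> list[float]:
    """Share of a normally distributed norm group under each of the five phrases."""
    cdf = NormalDist().cdf
    edges = [0.0] + [cdf(z) for z in (-outer, -inner, inner, outer)] + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


if __name__ == "__main__":
    for decile in sorted(DECILE_Z):
        print(f"decile {decile}: {dot_position(decile):.2f} in. from the left end")
    print("per cent under each phrase:", [round(100 * s) for s in band_shares()])
```

Under these assumptions the five phrases cover about 7, 23, 40, 23, and 7 per cent of the norm group, which agrees with the percentages stated for the present Form.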

At the bottom of the Form is a space for “Comments.” This is used 
for making recommendations regarding hiring, transferring, upgrading, 
etc. Any unusual results or “highlights” of the individual’s performance 
are also discussed here. 

The Test Report Form cannot be used, however, as a substitute for
an explanation of tests and their uses. A careful explanation of the
advantages and limitations of test results in general as well as of the
specific tests used should be given to any non-technician who receives
the Test Report Form as part of the basis for any personnel action.

Some difficulty may be encountered in devising descriptive phrases
for each of the tests but no more so than in any graphic rating scale.
Care must also be exercised in seeing that wherever “absolute” descriptive
phrases (e.g., can learn fairly complex duties) are used beneath the
relative percentile scale, the absolute terms have a valid meaning. For
an extreme example of this kind of error, if the typing test norms were
based on a group of first-class typists, even typists in the 3rd percentile
could not be said to be very slow or very inaccurate. If the results of
only one or two tests are to be presented, it is possible to arrange a form
whose vertical axis gives a description of absolute performance and whose
horizontal axis shows the standing in a group. This two-axis form, however,
becomes rather complicated and more difficult to explain.


Summary 


Some advantages of using the Test Report Form include the follow- 
ing: 

1. Standard interpretations of the test results are recorded for each 
test. 
2. Once the Form has been arranged, the marking of the test inter- 
pretation is very easy and can be done by a testing clerk. 

3. No statistical knowledge is required on the part of the operating
official who uses the test information. It is easily explained to any
“reader.”

4. It avoids any system which separates the total range of scores into 
a limited number (e.g., five) of groups. The concept of continuity of 
performance becomes more apparent. 

5. The Form may be used as the testing unit’s permanent test record 
if the raw score is placed beneath the name of the test. 


Received December 13, 1944.





A Comparison of the Reliability and Performance for the 
Minnesota Rate of Manipulation Test for Subjects 
Tested Individually and in Groups of Two 


Jacob Tuckman 
Jewish Vocational Service, Cleveland, Ohio 


The Minnesota Rate of Manipulation Test¹ is used to select workers
for office and factory jobs where speed of hand and finger manipulation is
important. The apparatus is a wood board containing 60 cylindrical
holes, arranged in four rows of fifteen, into which 60 slightly smaller blocks
can be placed. The test consists of two parts: Placing, in which the subject,
using one hand, places the blocks into the holes in a definite order
from a fixed position; and Turning, in which the subject picks up a block
with the left hand, turns it over, and puts it back into the same hole with
the right hand, alternating hands for each subsequent row. One practice
and four test trials are given. The score is the number of seconds required
to complete the four test trials.

The manual of directions accompanying the test states ‘‘It is advisable 
to have at least two people taking the test at the same time. The com- 
peting effect will stimulate each into doing his best, thus increasing reli- 
ability.” Since, in practice, it is not always possible to test in groups of 
two or more, the purpose of this study is to determine whether there are 
differences in reliability and in test performance between subjects tested 
individually and those tested in groups of two. 

Test scores for Placing and Turning were available for 255 boys and 
208 girls tested individually, and 185 boys and 200 girls tested in groups 
of two. For those tested individually the mean age was 16.0 for both 
boys and girls; for those tested in groups of two the mean age was 16.3 
for boys, and 15.9 for girls. The four groups were superior in intelligence 
as measured by the ACE Psychological Examination for High School 
Students and College Freshmen (1939-1942 editions), and the Terman 
Group Test of Mental Ability. For those tested individually, the mean 
percentile rank for intelligence was 80 for boys and 76 for girls; for those 
tested in groups of two, the mean percentile rank was 78 for boys and 75 
for girls. All were enrolled in a college preparatory course in several
senior high schools or were enrolled in junior high schools normally leading
to this course of study. Each of the four groups was about equally distributed
in grades 9-12. The median grade for each of the four groups
was 10A (latter half of the 10th grade).


¹ Developed by W. A. Zeigler and distributed by The Educational Test Bureau,
Inc., Minneapolis, Minnesota.

The reliability coefficients for Placing and Turning for the four groups
are presented in Table 1. The coefficients were obtained by correlating
scores on Trials 1 and 3 and scores on Trials 2 and 4, and corrected by
applying the Spearman-Brown prophecy formula. For Placing and
Turning, the reliability coefficients tend to be higher for both boys and
girls tested individually. These data are not in agreement with Zeigler’s
findings. In comparing the combined groups, the difference is greater
for Turning than for Placing. The critical ratio (D/P.E. diff.) is 1.51 in
favor of those tested individually for Placing and 3.98 for Turning, but
these differences are not statistically reliable.


Table 1

Split-half (Trials 1 and 2 and Trials 3 and 4) and Corrected Reliability Coefficients for
Placing and Turning for High School Boys and Girls Tested
Individually and in Groups of Two

                               Placing                       Turning
                                       Corrected                     Corrected
                        r     P.E.r       r*          r     P.E.r       r*

Individually
  High School Boys     [?]    .0098      [?]         [?]    .0059      [?]
  High School Girls    [?]    .0096      [?]         [?]    .0092      [?]
  Combined Group       [?]    .0068      [?]         [?]    .0049      [?]

In Groups of Two
  High School Boys     .87    .0121      .93         .91    .0083      [?]
  High School Girls    .87    .0119      .93         .84     [?]       [?]
  Combined Group       .87    .0085      .93         .88    .0076      [?]

* Corrected by Spearman-Brown prophecy formula.
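
The correction applied to the split-half coefficients can be illustrated briefly. The sketch below assumes the usual two-half form of the Spearman-Brown prophecy formula and the conventional probable error of a correlation coefficient, .6745(1 - r²)/√N; the latter formula, the formula for the probable error of a difference, and the function names are assumptions introduced here for illustration and are not quoted from the study.

```python
from math import sqrt


def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from a split-half correlation."""
    return 2 * r_half / (1 + r_half)


def probable_error_r(r: float, n: int) -> float:
    """Conventional probable error of a correlation coefficient (assumed formula)."""
    return 0.6745 * (1 - r * r) / sqrt(n)


def probable_error_difference(pe_1: float, pe_2: float) -> float:
    """Probable error of the difference between two independent coefficients."""
    return sqrt(pe_1 ** 2 + pe_2 ** 2)


if __name__ == "__main__":
    # Placing, boys tested in groups of two: r = .87 with N = 185.
    print(round(spearman_brown(0.87), 2))
    print(round(probable_error_r(0.87, 185), 4))
    # Placing, combined groups: probable errors of .0068 and .0085.
    print(round(probable_error_difference(0.0068, 0.0085), 4))
```

With r = .87 and N = 185 the sketch gives a corrected coefficient of .93 and a probable error of .0121, the values shown in Table 1 for boys tested in groups of two on Placing.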

The mean, standard deviation, and skewness of the distribution for 
Placing and Turning for the four groups are given in Table 2. The com- 
parison of the mean scores for the groups is presented in Table 3. 

Although the reliability is not increased, the performance of subjects
tested in groups of two is faster than that of subjects tested individually.
These differences are statistically significant when the performance of
boys or girls tested in groups of two is compared with that of boys or
girls tested individually. In comparing the combined groups, the D/σ diff. is
9.1 for Placing and 5.6 for Turning, in favor of those tested in groups of
two. For those tested individually, the performance of boys and girls is
almost identical for Placing; girls are faster for Turning. For those
tested in groups of two, girls are faster than boys for Placing and Turning.
These sex differences are not significant. In comparing the performance
of the combined groups with the norms of the Educational Test Bureau,
the mean score for those tested individually is equivalent to the 48th
percentile for Placing, and the 63rd percentile² for Turning. For those
tested in groups of two, the mean score is equivalent to the 67th percentile
for Placing, and the 74th percentile for Turning.


Table 3

Comparison of the Mean Scores for Placing and Turning for All Groups

                                            Placing                   Turning
Groups Compared                        D   σ diff.  D/σ diff.    D   σ diff.  D/σ diff.

Individually
  Boys and Girls                      [?]    [?]      [?]       3.7    2.05     [?]
In Groups of Two
  Boys and Girls                      [?]    [?]      [?]       2.9    1.73     [?]
Boys (Individually) and
  Boys (In Groups of Two)             [?]    [?]      [?]       8.0    2.04     [?]
Girls (Individually) and
  Girls (In Groups of Two)            [?]    [?]      [?]       7.2    1.73     [?]
Boys and Girls Combined
  (Individually) and
  Boys and Girls Combined
  (In Groups of Two)                 11.7   1.28     9.10       7.6    1.35    5.60
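
The D/σ diff. values for the combined groups can be reproduced from the D and σ diff. entries of Table 3. The sketch below is illustrative only. The function names are hypothetical, and the formula for the standard error of a difference between two independent means is the conventional one, assumed here because the standard deviations of Table 2 are not reproduced in this text; the values passed to it are likewise hypothetical.

```python
from math import sqrt


def se_of_difference(sd_1: float, n_1: int, sd_2: float, n_2: int) -> float:
    """Standard error of the difference between two independent means (assumed formula)."""
    return sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)


def critical_ratio(difference: float, se_difference: float) -> float:
    """Ratio of a difference between means to its standard error."""
    return difference / se_difference


if __name__ == "__main__":
    # Combined groups, from Table 3: D and sigma diff. for Placing and Turning.
    print(round(critical_ratio(11.7, 1.28), 1))   # Placing
    print(round(critical_ratio(7.6, 1.35), 1))    # Turning
    # Hypothetical standard deviations and group sizes, for illustration only.
    print(round(se_of_difference(3.0, 9, 4.0, 16), 3))
```

The ratios come to about 9.1 and 5.6. The small departures from the tabled 9.10 and 5.60 presumably reflect rounding of D and σ diff. before printing.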





The distributions of each of the four groups show a tendency for the 
scores to cluster toward the upper end of the scale. For boys tested 
individually the skewness is significant for Placing and Turning; for the 
other three groups the skewness is not significant. When the groups are 
combined the skewness is significant only for Turning for those tested 
individually. 

The noteworthy differences that exist between those tested individu- 
ally and those tested in groups of two for Placing and Turning warrant 
the establishment of separate norms for high school students. These are 
presented in Table 4. 


² For discussion regarding the tendency of subjects to perform more rapidly on
Turning than on Placing see J. Tuckman, A comparison of norms for the Minnesota
Rate of Manipulation Test. J. appl. Psychol., 1944, 28, 121-28.


Table 4

Cleveland Jewish Vocational Service Norms for Placing and Turning for High School
Students (Grades 9-12) Tested Individually and in Groups of Two

              Placing                           Turning
         (Time in Seconds)                 (Time in Seconds)
                     In Groups of                      In Groups of
    Individually         Two          Individually         Two
      N = 463          N = 385          N = 463          N = 385

       189.0            185.9            144.6            145.7
       203.4            195.1            154.2            153.0
       210.4            200.6            159.6            157.4
       213.5            205.4            163.0            160.7
       216.6            208.6            166.3            162.6
       219.7            210.9            168.8            164.5
       222.8            213.3            171.3            166.5
       225.9            215.4            173.8            168.7
       228.3            217.5            176.4            171.0
        [?]             219.5            178.9            173.1
        [?]             221.5            181.5            175.2
        [?]             223.5            184.1            177.2
        [?]             225.7            186.7            179.2
        [?]             228.2            189.3            181.1
        [?]             230.8            192.5            183.1
        [?]             233.6            196.0            185.5
        [?]             236.4            200.7            188.4
        [?]             239.9            207.2            192.5
        [?]             244.1            213.4            198.3
        [?]             252.3            221.3            206.5
        [?]             265.7            247.5            226.9


Summary


The reliability of Placing and Turning is not increased when high 
school students are tested in groups of two, but the performance of these 
students on both tests is significantly faster than that of students tested 
individually. 

Received December 11, 1944. 





The Comparative Validities of Two Tests of General Aptitude 
in an Army Special Training Center * 


William D. Altus, Capt., AGD 
Camp McQuaide, California 


In a previous article by Bell and Altus,¹ the numerous objectives of
an Army Special Training Center have been described. It is sufficient
here to say that the main function of such a Center was to bring the trainees 
to a level in reading, writing and arithmetic which the Army recognized 
as literate. Literacy, as defined by the Army, may be roughly compared 
with the achievement of the average public school fourth grader in the 
tool subjects. The trainee had to reach this level within twelve weeks or 
be discharged as inapt. Very infrequently a trainee was shipped if 
he possessed a skill which would make him quite valuable to the Army, 
even though he was illiterate. 

For the first few weeks after the Ninth Service Command Special 
Training Center was organized (September, 1943), the test of general 
aptitude administered to the incoming trainees was the Wechsler-Belle- 
vue Intelligence Test. When it was finally possible to obtain a set of the 
officially sanctioned general ability scale, The Wechsler Mental Ability 
Scale, Form B, it was immediately put into use. Both tests are much 
alike, the second deriving from the first mentioned. Both are adminis- 
tered individually. 

Since the disposition of the trainee was practically dichotomous (a few
were discharged for physical reasons), it was possible to compute validating
bi-serial coefficients of correlation for the various tests used by the
Personnel Consultants’ Section of this Center. The respective validities
of certain subtests of the Wechsler Mental Ability Scale, Form B, have
been previously reported by Altus.² A recapitulation of the validities
for four verbal subtests will be found in Table 1. Also given in the same
table are the validities for the five verbal subtests of the Wechsler-Bellevue.

* The opinions expressed in this article are those of the author and are not to be
construed as reflecting the official attitude of the Army of the United States. 1st Lt.
Ephraim Yohannan, Pfc. Sidney Feinberg, Sgt. Carl Karasek and T/5 Grant Smith
are to be credited with tabulating the original data presented in this article. Lt.
Yohannan is responsible for the statistical work involved in the study.

¹ Bell, H. M., and Altus, W. D. The work of psychologists in the Ninth Service
Command Special Training Center. Psychol. Bull., 1944, 41, 187-191.

² Altus, W. D. The differential validity and difficulty of certain verbal and performance
subtests of the Wechsler Mental Ability Scale. Psychol. Bull., 1945, 42,
238-249.

It will be noted that there are over five times as many cases involved
in the validating coefficients for the Army Wechsler as for the civilian
Wechsler. For that reason, greater confidence can be placed in the bi-serial
correlations for the Form B Scale. It is noteworthy that the
Arithmetic subtest of the Army version of the Wechsler is the most valid
of the four subtests in use at this Center, while the Arithmetic subtest of
the Wechsler-Bellevue was the least valid of the five subtests originally
administered. The difference between these two coefficients is almost
significant (D/P.E. diff. of 3.58). Apparently the validity of this subtest
was markedly improved through revision for Army use; perhaps one
should rather say that its validity was markedly improved for trainees
in an Army Special Training Center.


Table 1

The Comparative Validities of Certain Verbal Subtests of the Wechsler Mental Ability
Scale, Form B, and of the Wechsler-Bellevue Intelligence Scales in Predicting
the Disposition of Trainees in an Army Special Training Center

                          Form B                     Wechsler-Bellevue
Subtest           r_bis   P.E. r_bis    N        r_bis   P.E. r_bis    N

Arithmetic         [?]      .018       1991      .290      .046       [?]
Information        [?]      .018       1991      .475      .044       [?]
Comprehension      [?]      .019       1991      .462      .045       [?]
Similarities       [?]      .020       1991      .323      .046       [?]
Digit Span*                                      .442      .045       [?]
Total Scale        [?]      .017       1991      .579      .044       [?]

* Digit Span was administered to the earlier group (Wechsler-Bellevue) only.
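
The validating coefficients discussed here are bi-serial correlations between a continuous test score and the two-way disposition of the trainee. As a rough illustration, the sketch below uses the textbook formula, r_bis = ((M_p - M_q)/σ_t)(pq/y), where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The formula is assumed rather than quoted from the article, the function name is hypothetical, and the scores are invented for the example; none are data from the Center.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores: list[float], retained: list[bool]) -> float:
    """Bi-serial correlation between scores and a two-way criterion."""
    upper = [s for s, kept in zip(scores, retained) if kept]
    lower = [s for s, kept in zip(scores, retained) if not kept]
    if not upper or not lower:
        raise ValueError("both criterion groups must contain at least one case")
    p = len(upper) / len(scores)      # proportion in the retained group
    q = 1 - p
    unit_normal = NormalDist()
    # Ordinate of the normal curve at the point cutting off proportion p.
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    return (mean(upper) - mean(lower)) / pstdev(scores) * (p * q / y)


if __name__ == "__main__":
    # Invented subtest scores; True marks a trainee who reached the required level.
    scores = [7, 8, 9, 10, 11, 12, 13, 14]
    retained = [False, False, False, True, False, True, True, True]
    print(round(biserial(scores, retained), 3))
```

The coefficient presumes a normally distributed trait underlying the dichotomy, so with markedly non-normal scores it can exceed 1.00; this is a known limitation of the statistic and not a fault of the sketch.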

The Information subtest proved to be quite valuable in both versions
of the Wechsler. In Form B, Information takes second place only to the
Arithmetic; in the original scales, it was the most valid of the subtests,
its validity coefficient being even slightly higher than is Arithmetic for
the Army version. Comprehension takes third place for the Army
Wechsler; for the Bellevue it takes second. The original version of
Comprehension is quite obviously a much better test for the restricted
type of mentality found in a Special Training Center, the r_bis for the
Bellevue being .102 higher. The Similarities subtest is least valid for
Form B and has next to the lowest validity in the Bellevue Scales. The
validating coefficients for this subtest are about the same, .334 and .323.










In the form of the Army Wechsler employed here, the subtest on 
Digit Span was not administered. While the Digit Span test is not so 
valid as Information and Comprehension in the Bellevue, it is markedly 
better than Similarities and Arithmetic. It also has a higher validity 
than any of the four verbal subtests of the Army Wechsler, excepting 
Arithmetic. 

The Wechsler-Bellevue appears to have a somewhat better validity 
(.579, total scale) than the Army Wechsler (.553, total scale). This 
difference may, of course, be spurious. The inclusion of the relatively 
valid Digit Span test in the original version would tend to maximize the 
validity of the total scale used, especially if the intercorrelations of this 
test with the others were not high. It is probable that the validities of 
the two scales are approximately the same, when validity is defined as 
association with the criterion of the trainee’s disposition in an Army 
Special Training Center. 

One significant inference may, perhaps, be drawn from the data herein 
presented. It is that a quite valid scale which has been standardized 
upon the total range of intellect in a civilian population is also valid for 
the restricted mentality found among Army illiterates. It appears that 
revising such a scale for military use does not necessarily increase the 
validating coefficients to an appreciable degree. 


Received January 2, 1945. 





Use of the Shipley-Hartford Test in Evaluating Intellectual 
Functioning of Neuropsychiatric Patients * 


M. Erik Wright, Lt. (jg) H(S), USNR 
U. S. Naval Hospital, Oakland, California


The Shipley-Hartford Retreat Test¹ was designed as an aid in detecting
mild degrees of intellectual impairment in individuals of dull normal
or higher original intelligence. The test also yields an estimate of the 
present level of intellectual functioning as well as an inference as to the 
prior level. The questions of intellectual level and of impairment are 
often significant in both the diagnostic and prognostic phases of neuro- 
psychiatric case-work. 

This study has two purposes: (1) To survey some of the intellectual 
abilities of a sample of hospitalized service personnel with neuropsychi- 
atric involvements, and (2) to determine the validity of the Shipley- 
Hartford Test as a basis for estimating intellectual level. 


Subjects and Procedure 


The subjects were 977 patients² admitted to the Neuropsychiatric
Service of a mainland Naval Hospital who had been routinely examined 
with the Shipley-Hartford Test during the first fortnight after admission. 
Most of these men had seen overseas duty, with a large proportion only 
recently returned from active combat areas. Special referrals for a more 
extensive intelligence examination (Wechsler-Bellevue Test) were made 
for 134 of these patients within a few weeks of the first test. In Table 1 
we have presented the age distribution of the 977 subjects. The average 
age was 27 years, and the range from 17 years to 64 years. 

The educational achievement of the group is shown in Table 2.
Since original school records were not available the patient’s own report
was used. Although such unconfirmed statements are subject to error,
they permit a general estimate of the educational background of the
subjects.


* This article has been released for publication by the Division of Publications of
the Bureau of Medicine and Surgery of the United States Navy. The opinions and
views set forth in this article are those of the writer and are not to be considered as
reflecting the policies of the Navy Department.

¹ Shipley, W. C. A self administering scale for measuring intellectual impairment
and deterioration. J. Psychol., 1940, 9, 371-377.

² The distribution of the group according to neuropsychiatric classifications cannot
be published at this time because of war considerations.


Table 1

Age Distribution of a Randomly Selected Group of Neuropsychiatric Patients
Admitted to a Naval Hospital. N = 977

Age                     17-19   20-24   25-29   30-34   35-39   40+
Per Cent of Patients      16      38      17      10       9      9

The range in educational achievement was from second grade to the
master’s degree, with the average at the 10th grade. Almost 70% of the
total group went beyond elementary school and a third either completed
high school or went on to college. This achievement is somewhat superior
to that characteristic of the population as a whole and may be due
to the greater educational opportunities of the younger groups, to a tendency
to over-estimate their achievement, etc.


Table 2

Educational Achievement (own report) of a Randomly Selected Population of
Neuropsychiatric Patients. N = 977

Highest Grade Completed    2-7     8    9-11    12    13-15   16+
Per Cent of Group           15    16     38     23      7      2


Derivation of Scores 


The Shipley-Hartford Retreat scale consists of two parts, a vocabulary 
test and an abstractions test. The vocabulary section is set up as a 
multiple choice test in which one of four alternatives has to be matched 
with the key word for best similarity. The abstraction test consists of 
unfinished problems. The subject is required to abstract the principle 
necessary to complete each one of them (e.g. 1 2, 3 5, 5 8 (7 1)). 

Four scores may be obtained from the Shipley-Hartford, the vocabu- 
lary score, the abstractions score, the total score and the conceptual 
quotient (CQ). Age norms have been determined for the first three of 
these. The total score is the sum of the vocabulary and abstraction 
scores. 

Conceptual Quotient is defined by the test constructor as follows:³
“The CQ (conceptual quotient) Scale is based on the clinico-experimental
observation that in mild degrees of mental deterioration, and in other
conditions involving intellectual impairment, vocabulary is relatively
unaffected, but the capacity for abstract (conceptual) thinking declines
rapidly. . . . Impairment is measured by the extent to which the individual’s
abstract thinking falls short of his vocabulary. This deficit is
expressed conveniently in the CQ (conceptual quotient).”


³ Manual of Directions. Shipley-Hartford Retreat Scale. Published by the Neuro-
Psychiatric Institute of the Hartford Retreat, Hartford, Conn., 1940 (p. 2).

A table is presented in the Manual whereby the CQ may be obtained 
from the given vocabulary and abstractions scores. 


“Original Intelligence” 
Performance on the vocabulary test may be used as an approximation
of “original” intelligence⁴ for two reasons. First, vocabulary ability
seems more resistant to change than the ability to do abstractions.
Secondly, many studies have shown that performance on vocabulary
correlates more highly with general tests of intelligence than does performance
on any other single test. In Table 3 the vocabulary scores and
their age equivalents for the 977 patients included in this study are presented.


Table 3

Performance on the Vocabulary Section of the Shipley-Hartford Test by
977 Patients of a Neuropsychiatric Service

Raw       Per Cent of       Age             Intellectual
Score     all Patients      Equivalents     Level*

37-40         3.1           19.8-21.0       Very Superior
33-36         8.2           18.2-19.4       Superior
29-32        20.8           16.6-17.8       High Average
25-28        22.8           15.1-16.2       Average
21-24        19.8           13.5-14.7       Average
17-20        12.2           11.9-13.1       Low Average
13-16         8.2           10.3-11.5       Borderline
11-12         2.7            9.5- 9.9       Mental Deficiency
 1-10         2.4           Below 9.5       Mental Deficiency

* The age equivalents may be considered as mental ages. These have been translated
in terms of intellectual level on the basis of Terman and Merrill’s classification
and distribution of adult mental ages.

⁴ The term, original intelligence, as here used, refers to the intellectual level prior to
impairment and has no implications for the nurture problem.


The mean vocabulary score was 24.8 which is equivalent to a vocabulary
age of 15. Two-thirds of the patients (68%) fall within one sigma of
the mean (Vocabulary Score from 17.9 to 31.7) and 97% fall within two
standard deviations (Vocabulary Score from 11.0 to 38.6). This indicates
that the distribution closely approximates a normal distribution
curve.
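
The comparison with the normal curve may be made explicit. In the sketch below the standard deviation of 6.9 is not reported directly; it is inferred from the one-sigma limits of 17.9 and 31.7 given above, and the function names are illustrative only.

```python
from statistics import NormalDist

MEAN_VOCABULARY = 24.8
SD_VOCABULARY = 6.9   # inferred from the quoted one-sigma range, 17.9 to 31.7


def sigma_range(k: float) -> tuple[float, float]:
    """Score limits lying k standard deviations either side of the mean."""
    return (MEAN_VOCABULARY - k * SD_VOCABULARY,
            MEAN_VOCABULARY + k * SD_VOCABULARY)


def normal_share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    unit_normal = NormalDist()
    return unit_normal.cdf(k) - unit_normal.cdf(-k)


if __name__ == "__main__":
    for k, observed in ((1, 68), (2, 97)):
        low, high = sigma_range(k)
        expected = 100 * normal_share_within(k)
        print(f"{k} sigma: {low:.1f} to {high:.1f}; "
              f"observed {observed}%, normal curve {expected:.1f}%")
```

A normal curve places about 68 per cent of cases within one sigma and about 95 per cent within two, against the observed 68 and 97 per cent.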

Table 3 also shows that three-fourths (74.5%) of this group of patients 
were of average or above average intelligence. This compares very 
favorably with Wechsler’s estimate of 75% based on a random sampling 
of approximately 1,000 adults in which a much more elaborate and refined 
testing instrument was used. From this it may be inferred that the basic 
intelligence of men who later became neuropsychiatric casualties is es- 
sentially the same as that of a random sampling of the adult population. 
However, a random sampling of individuals in the naval service who are 
not neuropsychiatric patients might reveal significant differences. Un- 
fortunately, such comparative data are not as yet available. 


Functional Intelligence 


In order to distinguish “original intelligence” from the present intellectual
level of an individual, the term “functional intelligence” is introduced.
The usefulness of the Shipley-Hartford Test as a measure of
functional intelligence may be determined by correlating its results with
those on a test whose validity has already been established. For this
purpose, a sample of 134 of the total group studied were administered
both the Shipley-Hartford and the Wechsler-Bellevue tests. The total
scores on the Shipley-Hartford (abstraction plus vocabulary) were correlated
with the total scores on the Wechsler-Bellevue.⁵ The resulting
correlation, r = .77 ± .03, is as high as most of the correlations between
two individual tests of intelligence, and is particularly good in light of
the many differences between a group and an individual test of intelligence
(such as time, range of abilities tested, administration, etc.). Thus, the use
of the Shipley-Hartford test as a rough, but easily determined, approximation
of the general intellectual level of the individual, when conditions do
not permit the use of the more refined individual intelligence examination,
is supported by these data.

The results on the distribution of functional intelligence as presented
in Table 4 show that many of the patients are functioning considerably
below their original intelligence. Whereas the mean original intelligence
score is very close to the upper limit of the average range, the mean
functional intelligence score of 43 is very close to the lower limit of the
average range.


Table 4

Functional Intelligence of a Group of Neuropsychiatric Patients as Measured
by the Total Scores on the Shipley-Hartford Test

Raw                       Age
Score            %        Equivalents        IQ Description

74-80            7        19.6-20.8          Very Superior
66-73           6.8       18.0-19.5          Superior
57-65          16.0       16.5-17.9          High Average
50-56           [?]       15.1-16.3          Average
42-49           [?]       13.5-14.9          Average
35-41           [?]       12.1-13.3          Low Average
25-34           [?]       10.4-11.9          Borderline
24 and Below    [?]       10.2 and Below     Mental Deficiency


The vocabulary based estimate showed that approximately 75% of
the total group were either of average or above average intelligence. In
contrast to this, the total score estimate indicates that only 53% of the
group are functioning at that level. The difference is most striking in
the lower intelligence levels. On the basis of “original” intelligence, 13%
of the group were characterized as being of borderline intelligence or
mentally deficient. More than twice this number, 32%, are “functioning”
at these levels.


⁵ The correlation of r = .64 ± .03 between the Wechsler-Bellevue Verbal Score and
the Shipley-Hartford Vocabulary score showed considerable overlap between the two
tests, but the verbal and vocabulary scores alone seem less adequate as a basis for
estimating the general level of intelligence.


Intellectual Efficiency 


The concept of intellectual impairment suggested by the above dis- 
cussion of original and functional intelligence has been treated by Shipley 
in terms of a conceptual quotient (CQ) which is the relationship of the 
abstract to the vocabulary abilities. The theoretical basis for the CQ is 
supported by a number of investigations. 

The ability to formulate a principle which “abstracts” common elements
from a group of events has been found to be a more complex psychological
process than the type of discrimination between two objects
which is so large a component of vocabulary skill. The ability to think
in abstract terms or learn abstract relationships has also been shown to be
a later development in the life of the individual⁶ ⁷ and is more readily
disturbed by psychopathological⁸ and organic brain disorders⁹ ¹⁰ ¹¹ ¹²
than the more concrete abilities.


⁶ Strauss, A. A., and Werner, H. Disorders of conceptual thinking in the brain-injured
child. J. Nerv. Ment. Dis., 1942, 96, 153-172.

⁷ Babcock, H. An experiment in the measurement of intellectual deterioration.
Arch. Psychol., 1930, No. 117, 1-105.

As the CQ decreases, intellectual impairment becomes more probable.
The results show that a substantial proportion (62%)
of the neuropsychiatric group had CQ’s suggesting intellectual impairment.
Using the interpretation of CQ’s offered by Shipley, 10% of the
patients had “slightly suspicious,” 14% “moderately suspicious,” 26%
“very suspicious,” and 12% “probably pathological” evidence of intellectual
impairment. This is consistent with the clinical observation
that neuropsychiatric patients are frequently unable to utilize their
intellectual capabilities adequately to make plans, reach decisions, or deal
with their problems in general.


Summary and Conclusions 


The Shipley-Hartford Scale for the Measurement of Intellectual Im- 
pairment was given to 977 randomly selected Neuropsychiatric patients 
at a service hospital. Of this group, 134 were also given the Wechsler- 
Bellevue Intelligence test. The following conclusions seem warranted: 

1. The Shipley-Hartford test can be used as a rough estimate of 
functional intelligence. 

2. The distribution curve of “original” intelligence of neuropsychi- 
atric patients is very similar to that obtained by Wechsler in a sampling 
of the adult population. 

3. A large proportion (62%) of neuropsychiatric patients tend to 
show a lowering of efficiency in their intellectual functioning. 


Received January 12, 1945. 


⁸ Kendig, I., and Richmond, W. V. Psychological studies in Dementia Praecox.
Ann Arbor: Edwards Bros., Inc., 1940, 1-166.

⁹ Goldstein, K., and Scheerer, M. Abstract and concrete behavior; an experimental
study with special tests. Psychol. Monogr., 1941, 53, No. 2 (Whole No. 239).

¹⁰ Hunt, H. F. A practical clinical test for organic brain damage. J. appl. Psychol.,
1943, 27, 375-386.

¹¹ Hunt, H. F. A note on the clinical use of the Hunt-Minnesota test for organic
brain damage. J. appl. Psychol., 1944, 28, 175-178.

¹² Hunt, H. F. The Hunt-Minnesota test for organic brain damage. Minneapolis:
The University of Minnesota Press, 1943.








A Social I.E. Scale for the Minnesota Multiphasic 
Personality Inventory 


Lewis E. Drake 
University of Wisconsin 


The Minnesota T-S-E Inventory (1) has been used as a part of the
standard battery of tests administered to students in the guidance
program at the University of Wisconsin for over a year. The inventory
has yielded data which have been helpful in counseling students. Since,
however, the Minnesota Multiphasic Personality Inventory (2) is also a
part of the standard battery and since many items of the latter resemble
items in the former, it was thought desirable to try to devise keys to
score the Multiphasic to yield data now obtained by means of the T-S-E
Inventory. This report is limited to results obtained for a Social I.E.
scale. Scales for Thinking and Emotional introversion-extroversion are
not yet ready for publication.


Procedure 


An item analysis of the Multiphasic Personality Inventory was made by
contrasting the percentage responses of two groups of students to the
items. One group consisted of 50 students who obtained centile ranks of
65 and above on the T-S-E Inventory when scored for Social
introversion-extroversion. The second group consisted of 50 students who
obtained centile ranks below 35 on the T-S-E Inventory. The students
were all females because of the small male population in the University,
but the scale was validated with a male population as will be shown
later. No other factor was used in the selection of cases except that
three cases were not included because the L scores on the Multiphasic
were quite high.

Items were selected for the key which showed a difference between the 
percentage responses of the upper and lower groups of at least twice the 
standard error of the difference. Some significant items, however, were 
eliminated because there was an extremely high or extremely low fre- 
quency of response for both upper and lower groups. 
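The selection rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the report does not give the formula used for the standard error of the difference, so the usual unpooled formula for two independent proportions is assumed here; the cut-offs `floor` and `ceiling` for "extremely low or high frequency" are hypothetical, as are the item names and percentages.

```python
import math

def se_of_difference(p1, n1, p2, n2):
    """Standard error of the difference between two independent proportions
    (unpooled formula; an assumption, the report does not state its formula)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def select_items(upper, lower, n_upper=50, n_lower=50,
                 min_ratio=2.0, floor=0.10, ceiling=0.90):
    """Return items whose upper/lower difference is at least `min_ratio`
    standard errors, dropping items that nearly everyone (or nearly no one)
    in both groups endorses. `floor` and `ceiling` are hypothetical cut-offs."""
    selected = []
    for item in upper:
        p_up, p_low = upper[item], lower[item]
        se = se_of_difference(p_up, n_upper, p_low, n_lower)
        if se == 0:
            continue  # no variation in either group, nothing to test
        if abs(p_up - p_low) / se < min_ratio:
            continue  # difference is less than twice its standard error
        both_high = p_up >= ceiling and p_low >= ceiling
        both_low = p_up <= floor and p_low <= floor
        if both_high or both_low:
            continue  # significant, but response frequency is extreme
        selected.append(item)
    return selected

# Hypothetical proportions of each group marking the item
upper = {"item_1": 0.70, "item_2": 0.50, "item_3": 1.00}
lower = {"item_1": 0.30, "item_2": 0.44, "item_3": 0.92}
chosen = select_items(upper, lower)
```

With these made-up figures only `item_1` is kept: `item_2` fails the two-standard-error test, and `item_3` passes it but is dropped because both groups endorse it almost unanimously.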

After the item selection had been completed a new group of Multiphasic
record sheets was scored with the obtained key for purposes of
validation. These record sheets contained the responses of a group of
female students who cleared through the testing office after the group
of students who provided the data for the item analyses. The scores
obtained with the new key were then correlated with the Social I.E.
scores obtained on the T-S-E Inventory. The key was then used for
scoring all available record sheets for male students, provided T-S-E
scores were also available, and these scores were correlated with the
T-S-E scores.

Finally, norms were established by scoring all Multiphasic record
sheets available. The norms are reported in terms of T scores obtained
in the customary way, namely: $T = 50 + \frac{10(X_i - \bar{X})}{S}$,
where $X_i$ is the raw score, and $\bar{X}$ the mean and $S$ the
standard deviation of the raw scores for the normative group.
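As a worked illustration of the T-score conversion, the following sketch applies the formula to a small set of raw scores. The raw scores are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the report does not say which was used.

```python
from statistics import mean, pstdev

def t_scores(raw_scores):
    """Convert raw scores to T scores: T = 50 + 10 * (X - mean) / S.
    S is taken as the population standard deviation (an assumption)."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [50 + 10 * (x - m) / s for x in raw_scores]

# Hypothetical normative raw scores
norm_group = [10, 20, 30]
converted = t_scores(norm_group)
```

A raw score equal to the group mean always converts to a T score of 50, and each standard deviation above or below the mean adds or subtracts 10 points.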





Results 


The items for this key are listed in Table 1 according to the way they
are designated on the Multiphasic Record Sheets. The raw score is
obtained by counting one point for every cell on the record sheet having
an X corresponding to the key and one point for every cell which is
blank corresponding to a 0 on the key. The cells containing question
marks are not counted (3).


Table 1
Scoring Key for Social I.E.
([?] marks an entry that is illegible in the source)

A36  X     D50  X     E36  X     F8   X     [?]  0
A37  X     [?]  X     E38  X     F9   X     H51  X
A38  X     [?]  X     E43  X     F30  0     H52  X
B6   X     [?]  X     E44  X     F31  X     I21  0
B22  X     E23  X     E46  X     [?]  X     [?]  [?]
C2   X     E26  X     E47  X     F36  X     [?]  [?]
C25  X     E27  X     E49  X     [?]  [?]   [?]  [?]
C48  0     E28  X     E52  X     F45  X     [?]  X
C55  X     E29  X     E55  X     [?]  X     [?]  X
D2   0     E30  X     F2   X     G24  0     I38  X
[?]  0     E32  X     F3   [?]   G35  X     I41  X
D35  0     E33  X     F4   X     G42  0     J24  0
D37  0     E34  X     F5   X     H2   X     J32  0
D45  0     E35  X     F6   X     H12  0     [?]  X
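The counting rule for the raw score can be sketched as follows. The four key entries used are among those clearly legible in Table 1; the response sheet is hypothetical, and representing a blank cell by an empty string is merely a convenience of this sketch.

```python
def raw_score(responses, key):
    """Count one point for each X that matches an X on the key and one
    point for each blank cell where the key shows 0. Cells marked '?'
    earn nothing. `responses` maps item -> 'X', '' (blank) or '?'."""
    score = 0
    for item, keyed in key.items():
        mark = responses.get(item, "")  # a missing item is treated as blank
        if mark == "?":
            continue
        if keyed == "X" and mark == "X":
            score += 1
        elif keyed == "0" and mark == "":
            score += 1
    return score

# A small excerpt of the key (entries legible in Table 1)
key_excerpt = {"A36": "X", "F30": "0", "J24": "0", "H51": "X"}

# Hypothetical record sheet
sheet = {"A36": "X", "F30": "", "J24": "X", "H51": "?"}
points = raw_score(sheet, key_excerpt)
```

Here the sheet earns one point for A36 (X on both), one for F30 (blank where the key shows 0), none for J24 (X where the key shows 0) and none for H51 (question mark).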

Twenty-eight of the items on this key have not been used on any keys 
reported by Hathaway and McKinley. 

Record sheets for 87 female students were then scored with this key
and the scores were correlated with the Social I.E. scores on the T-S-E.
The resulting coefficient of correlation was -.72. The coefficient was
negative because the key for the Multiphasic was constructed so that a
high score would indicate introversion whereas on the T-S-E a low score
indicates introversion.

Record sheets for 81 men students were likewise scored and the scores
correlated with Social I.E. scores on the T-S-E. The resulting coefficient
was -.71. Hence the key was used for both male and female students in
obtaining norms.

Table 2 gives the T scores for this scale based upon records for 350
female students and 193 male students. Separate norms were computed
for males and females, but they were so similar, differing by only 2 from
raw score 0 to 6 and being identical for most of the range, that the tables
were combined for both sexes.


Table 2
The T Scores for the Social I.E. Scale

 Raw      T        Raw      T
Score   Score     Score   Score

  52      79
  51      78
  50      77
  49      76
  48      75
  47      74
  46      73
  45      72
  44      71
  43      70
  42      69
  41      68
  40      67
  39                        46
  38                        45
  37               19       44
  36               18       43
  35               17       42


Summary 


1. Using the Social I.E. scores on the Minnesota T-S-E Inventory for 
a group of female students as a criterion, an item analysis of the Minne- 
sota Multiphasic Personality Inventory was made. 

2. The derived key appears to have equally good validity for both 
male and female students. 

3. An attempt is being made to derive Thinking and Emotional I.E. 
scales in a similar way. 


Received July 5, 1945. 
Interests of Senior and Junior Public Administrators


Edward K. Strong, Jr. 
Stanford University, California 


Is it possible to differentiate junior and senior public administrators 
on the basis of their interests? By junior and senior administrators we 
mean roughly those earning three to five thousand dollars a year and 
those earning nine thousand and over. 

The Committee on Public Administration of the Social Science Re- 
search Council obtained for the writer 550 Vocational Interest Blanks 
filled out by public administrators who were, in the judgment of the 
Committee, successful administrators. Some of these blanks for one 
reason or another could not be used. A few additional blanks were sup- 
plied from our files. For certain comparisons use has been made of 
blanks of city school superintendents but these blanks have not been in- 
cluded in general summaries respecting public administrators. 

The public administrators have been classified in two different ways.
First, they were classified into sub-groups according to the function
they perform, such as welfare, personnel, taxation, etc. Some of the
administrators could not readily be classified in this manner because
they direct employees engaged in many different activities, as for
example, hospital superintendent, city manager and senior official in
the Department of Agriculture. Under this first classification we have
(a) functional and (b) general manager sub-groups as listed in Table 1.


Managerial Responsibility 
The second classification, with which we are primarily concerned in
this article, relates to the degree of managerial responsibility exercised by
the administrator. The 550 cases¹ were assigned as far as possible to
five classes representing such degrees of managerial responsibility. As
there were only a few men assigned to the lowest class it was discarded.
In determining the degree of managerial responsibility the following
factors were taken into account: (1) Number of employees, (2)
Whether employees were all engaged in the same type of work or in a
variety of activities, (3) Whether employees were engaged in relatively
simple or highly technical work, (4) Whether the position was essentially
a line or a staff position, (5) Whether the administrator was in charge of
the unit or was (a) an assistant or deputy administrator or was (b) an
assistant to the administrator.

1 As shown in Table 1 only 518 cases were used. Of the 550 cases, 10 were obviously
not administrators, one turned out to be a duplicate, 10 evidently did not earn $3,000,
and 17 could not be classified. Six cases were added from our files, making actually
518 cases used from among 556.


Table 1

Classification of Public Administrators according to (1) Function Performed
and (2) Managerial Rank

                                        Managerial Rank
                                    A      B      C      D    Total
Functional Groups
  Personnel                         3      2     29     25      59
  Social Insurance                  3      1      4      3      11
  Welfare                           5      9     26      8      48
  Taxation                          0      0      6      9      15
  Comptroller-Finance               1      4     13      6      24
  Recreation and Parks              0      1      7      5      13
  Office Manager                    0      1      2      2       5
  Statistics                        1      3     21      3      28
  Public Health                     0      8     11      6      25
  Engineer                          1      9     14      2      26
  Chemist-Physicist                 0      3     11      0      14
  Miscellaneous *                   4      8     18     10      40
General Manager Groups
  Prison Warden                     0      0     16      0      16
  Hospital Superintendent           6     17     12     10      45
  Reform School Superintendent      0      0      0      9       9
  City Manager                      1      5      8     27      41
  Dept. of Agriculture and
    Commerce and TVA                9     30     13      1      53
  Forest Service                    0     11     20     15      46
Total                              34    112    231    141     518

* Includes 8 Publicity, 6 Law enforcement, 6 Education, 5 Lawyer, and 15 others.

After the majority of public administrators had been assigned to one 
of the four classes it was found that salaries of these men approximated 
the following: Class A $9,000 and up; Class B 7,000 to 8,999; Class C 
5,000 to 6,999; and Class D 3,000 to 4,999. Amount of salary was then 
used as an additional factor in classifying the men. Some of those al- 
ready assigned a class but whose salary was out of line proved to have 
been classified on really insufficient evidence and were reclassified on the 
basis of salary. The same basis was used in cases which had not been 
classified at first, because there was insufficient information concerning 
their work. 
After making the functional classification and before making this
managerial responsibility classification the writer corresponded with
many of the men whose records were incomplete regarding the nature of
their work. In most cases the desired information was obtained from
them or from certain officials. In some cases salaries were secured when
no other information was forthcoming. The additional information
indicated that only a few had been incorrectly classified in terms of
function and added considerably to our understanding of the relative
importance of the duties of these men.

Positions in some organizations are notoriously underpaid in com- 
parison to other organizations; some younger men carry heavy responsi- 
bilities without commensurate salary and some older men have relin- 
quished managerial responsibility for advisory work without decrease in 
pay. An attempt has been made to take such complications into account. 

For convenience the classification is given in terms of salary but it 
must be recognized that the four classes represent degrees of responsibil- 
ity rather than actual amounts of money received, although in most 
cases the man actually receives the salary of the class to which he is as- 
signed. 

The writer found the most difficult groups to classify were superintend- 
ents of hospitals and of reform schools, and prison wardens. Their 
classification is pretty much a guess. The writer is certain now that he 
was too much influenced by salary received by hospital superintendents. 
None of them it seems should have been assigned to Class A considering 
the calibre and scope of work performed by others in this class. 

It is quite likely that some men have been assigned to the class above 
or below that to which they properly belong but we doubt if any man has 
been assigned to a class two steps above or below the class to which he 
belongs. Considering the complexities of the task and the lack of de- 
tailed information in many cases, the writer believes his classification is 
good. It is, in his opinion, much better than he believed possible at the 
beginning of the study. 

Table 1 gives the number of public administrators from each functional
group that was assigned to Classes A to D. Since the four classes are
composed of different proportions of these functional groups and since
such functional groups differ appreciably in interests it was feared
that the summaries based on these four classes would be unduly
influenced by the uneven representation. Mean scores for each class were
calculated in the usual way and also by weighting the groups
proportionately. Approximately the same results were obtained by both
methods. We have used, however, the weighted means, except in the
calculation of critical ratios of differences between means.
Interests of Senior and Junior Administrators


The mean scores on 34 occupational interests are given in Table 2 
for Classes A and D. The data for Classes B and C are not published 
for their scores fall between those of A and D in 52 out of 70 cases. Such 
a relationship is illustrated by the artist interest scale, where Class A 
scores 26.2 on this scale; Class B scores 23.2; Class C, 22.2; and Class D, 
20.7. In 17 of the remaining 18 cases the scores of Groups B and C deviate
from those of A or D by less than 1.6 in score. The greatest exception
is in the case of city school superintendent interest, where Class A scores 
36.0 on this scale; Class B scores 33.7; Class C, 32.3; and Class D, 34.9. 
This is the only case among 70 where the score of either Class B or C falls 
outside the scores of Groups A and D by an amount approximating a 
statistically significant difference (critical ratio of 2.4). Consequently 
we are justified in assuming that as far as interests go the four groups 
constitute a continuum and that differences between Groups A and D 
reflect differences in the four groups. The intercorrelations between the 
interest profiles of the four classes, as given in Table 4, further support 
this statement. (Such an array of data affords good evidence that the 
classification into the four groups has real merit.) 

The fact that the scores in Class B fall so uniformly between Classes
A and D and that they differ very little from Class A justifies us in
basing conclusions on Class A to a greater extent than is warranted by
the small number of cases included in it, i.e., 34.


Table 3

High Interest Scores of Four Classes of Administrators and of Presidents of
Manufacturing Concerns

              A Ratings     B+ Ratings           B Ratings
Group         45 and up     40 to 44             35 to 39

Class A                     Lawyer               Personnel manager
                                                 President
                                                 Advertiser
                                                 Journalist
                                                 City School Supt.

Class B                     Personnel manager    Production manager
                                                 Lawyer

Class C                     Personnel manager    Production manager
                                                 Lawyer

Class D                     Personnel manager    Production manager
                                                 Math.-Science teacher
                                                 City School Supt.

President     President     Production mgr.      Sales manager
                                                 Realtor
                                                 Purchasing agent

When only the high interest scores in Table 2 are considered we have
them as shown in Table 3. We have included here the corresponding
data for presidents of manufacturing concerns for comparison. The
data in the table concerning Classes A and D and presidents are not too
easily appreciated but as soon as these scores are shown on the interest
globe (Fig. 1) their significance becomes apparent. The strongest
interests of senior administrators are in president of Group XI,
advertiser, lawyer and journalist of Group X and personnel manager and
city school administrator of Group V, all located near the top of the
right-hand figure. The strongest interests of Class D are in production
manager of Group III, math.-science teacher of Group IV and personnel
manager and city school superintendent of Group V, located across the
bottom of the same figure and extending upwards at the right to overlap
with the interests of Class A in Group V. The strongest interests of
presidents are at the left-hand side of the figure including realtor and
sales manager of Group IX, purchasing agent of Group VIII, president in
which they overlap with Class A, and production manager in which they
overlap with Class D. Each of the three has some strong interests in
common with the other two and each has some interests peculiar to
itself.²

The rank-order correlations between the interest profiles of Classes 
A and D and president are: Class A and Class D = .43; Class A and 
president = .59; and Class D and president = .39. These are low co- 
efficients as this type of correlation goes (see Table 4). They are about 
equal to the correlations between the personnel and publicity functional 
groups and between city school superintendents and forest service ad- 
ministrators, whose interests differ appreciably. As far as the three 
coefficients go they suggest that senior administrators and presidents are 
more similar in their interests than either is similar to junior administra- 
tors. 
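For readers who wish to see how a rank-order coefficient between two interest profiles is obtained, a minimal sketch follows. It assumes the coefficient is Spearman's rank correlation, computed as the product-moment correlation of the ranks with tied scores given their average rank; the article does not spell out its computing procedure, and the two profiles shown are hypothetical, not the values of Table 2.

```python
import math

def ranks(values):
    """Rank values from 1 (lowest) upward, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def rank_correlation(profile_a, profile_b):
    """Spearman rank-order correlation between two score profiles."""
    ra, rb = ranks(profile_a), ranks(profile_b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical mean scores of two groups on the same five interest scales
group_1 = [26.2, 31.0, 38.2, 42.2, 45.0]
group_2 = [30.1, 28.4, 36.0, 33.7, 41.5]
rho = rank_correlation(group_1, group_2)
```

In the article each profile would consist of a group's mean scores on all the occupational scales, so that a high coefficient means the two groups order the occupations in much the same way.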

If we include in our comparison not only the high ratings of A, B+ and
B but also the B- ratings we can add that senior administrators have
more the interests of presidents, of men engaged in selling and influencing
people and of scientists, whereas junior administrators have more the
interests of social workers, production managers, general office people and
skilled workmen. On the same basis we can say that senior administrators
differ from presidents by having more of the interests of men engaged
in social work and in influencing people but not in selling them; by having
more the interests of scientists, particularly of psychologists and
engineers; and of public accountants but not of office people, including
accountants; and by having less of the interests of production managers
and of presidents.

2 The strongest interests of both Classes B and C are in personnel manager,
production manager and lawyer, their interests falling between those of
Classes A and D.

Table 4

Rank-Order Correlations Between Interest Profiles of Classes A, B, C and D and of
Certain Groups of Public Administrators and Business and Professional Men
(blank cells are illegible in the source)

                                          Classes
                                   A      B      C      D
Class A                                  .84    .69    .43
Class B                           .84           .92
Class C                           .69    .92           .87
Class D                           .43           .87
Functional Groups
  Personnel                                     .87
  Recreation                      .21           .58    .61
  Statistics                                    .69
  Law                                           .32
  Chemist-Physicist                             .32
General Manager Groups
  Prison Warden                                 .88
  City Manager                                  .82
  District Ranger                -.21           .31    .37
  Forest Supervisor               .29           .73
  Forest Serv. Administrator      .87    .87    .74
  Dept. of Agriculture Admin.     .86           .85
  Dept. of Commerce Admin.        .61           .74
Business and Professional Men
  President                       .59    .68    .67    .39
  Production Manager              .37           .71
  Personnel Manager               .61           .83
  Engineer                        .46           .46
  Lawyer                          .81           .65


Several side lights regarding the four classes of administrators and
how their interests are related to those of functional and general manager
groups are brought out in Table 4. The correlations in this table are all
between interest profiles of different groups. The first four rows of
coefficients show that Classes A, B, C and D stand in the order of a
continuum, as previous data indicated; that Classes B and C are most
closely related (coefficient of .92); second, C and D (.87); and third,
A and B (.84); and that A and D are not closely related (.43).

Both the personnel functional group and personnel managers from 
industry have interest patterns which agree particularly well with Classes
B and C, although 91% of the personnel functional group are assigned to 
Classes C and D, and both these personnel groups correlate about .61 
with Class A. On the other hand, the recreation group correlates much 
higher with Class D (.61) than with Class A (.21). The statistics, law 
and chemist-physicist functional groups correlate to about the same de- 
gree with all four classes, the coefficients being in the sixties for statis- 
ticians, and quite low for the other two groups. 

Among the three groups of forest service men, the interests of district
rangers are little related to any of the four classes, but more closely
related to Class D (.37) than to Class A (-.21); the forest supervisors
correlate highest with Class C (.73) and the administrators correlate
highest with Classes A and B (.87). Relationships such as these are
seemingly what should be expected and aid one in understanding what the
other coefficients mean.

The interests of prison wardens and city managers are like those of 
forest supervisors in being more closely related to Class C than to the 
other three classes. 

Eleven of the 34 men assigned to Class A are from the Department of 
Agriculture, 3 from the Department of Commerce and none from the 
Forest Service (which is a part of the Department of Agriculture but 
which has been kept as a separate group in this study). Nevertheless 
the Forest Service administrators appear to have interests more related 
to Class A than any other group (.87) with the Department of Agriculture 
practically tied with them (.86) and the Department of Commerce less 
closely related (.61). 

Among business and professional men it is the lawyer who has interests
most closely related to Class A (.81).³ Presidents’ interests are
more closely associated with Class B (.68) and production managers’ 
and personnel managers’ interests correlate highest with Class C. The 
engineers’ interests are little related to any class just as is true of the 
functional group of chemists-physicists. 

It seemed strange to the writer that the interests of Department of
Commerce men should differ so much more from Classes A, B, C and D
than the interests of men in the Department of Agriculture. Since the
former include 24 men assigned to the statistics functional group and
since these men have a rather peculiar assortment of interests it occurred
to us that the rather low correlation might be caused by the presence of
the statisticians. Accordingly correlations were calculated between the
remainder of the Department of Commerce group, omitting the
statisticians, and Classes A, B, C and D. The latter four coefficients
average .09 lower than the original coefficients given in Table 4. So it
is not the presence of statisticians in this department that causes it
to have interests less like Classes A, B, C and D than does the
Department of Agriculture. The data in Table 5 indicate that the
interests of the Department of Commerce administrators are more like
those of presidents, production managers and engineers than are those of
the Department of Agriculture men and less like those of personnel
managers and lawyers.

3 All our data make clear that the five lawyers constituting our law functional group
differ from lawyers in business.


Table 5 


Rank-Order Correlations Between the Interest Profiles of Business Men and of 
Administrators, from the Departments of Agriculture and Commerce 








Dept. of Dept. of 
Agriculture Commerce 


President A7 
Production Manager 38 
Personnel Manager 63 
Engineer 31 
Lawyer 73 








Differentiation of Junior and Senior Administrators 


The preceding data indicate that junior and senior administrators
differ appreciably in their interests. It is not easy, however, to express the
differences between the two by such complex relationships as are por- 
trayed in Figure 1. Are there any short cuts that may be used for this 
purpose? 

First of all let us note that such differentiation can not be obtained 
by use of the public administrator interest scale. The mean scores of the 
four classes of public administrators on the public administrator scale 
are: A, 52.1; B, 49.7; C, 49.3; and D, 48.8. Not even the difference in 
mean scores of Classes A and D is statistically significant (critical ratio 
of only 1.9). Differences in success-failure, or in this case, differences in 
senior-junior administrator standing, are not likely to be revealed by a 
scale based on both groups.⁴

4 Scores of sub-groups of public administrators on the public administrator scale are
given in: E. K. Strong, Jr., Interests of public administrators, Public Personnel
Review, 1945, 6, 166-173.

From Table 2 we note that there are four scales on which senior
administrators score significantly higher than junior administrators,
that is, by scores of 6 or more. These scales are president, journalist,
advertiser and lawyer. And there are ten scales on which senior
administrators score lower than junior administrators. One way to decide
whether a man is a senior rather than a junior administrator is to note if
he scores 35 or higher on the president, journalist, advertiser and lawyer
scales and if he scores lower than 30 on the Y. M. C. A. physical director,
accountant, office worker, aviator, farmer, math.-science teacher, printer,
policeman and forest service scales, and lower than 20 on the carpenter
scale.
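The rule of thumb just stated can be written out as a simple check. This is only a sketch of the rule as worded in the text; the dictionary keys are hypothetical labels for the scales, and the two score profiles are invented for illustration.

```python
HIGH_SCALES = ["president", "journalist", "advertiser", "lawyer"]
LOW_SCALES = [
    "ymca_physical_director", "accountant", "office_worker", "aviator",
    "farmer", "math_science_teacher", "printer", "policeman",
    "forest_service",
]

def resembles_senior_administrator(scores):
    """Apply the rule of thumb: 35 or higher on the four 'high' scales,
    below 30 on the nine 'low' scales, and below 20 on carpenter."""
    high_ok = all(scores[name] >= 35 for name in HIGH_SCALES)
    low_ok = all(scores[name] < 30 for name in LOW_SCALES)
    return high_ok and low_ok and scores["carpenter"] < 20

# Hypothetical profiles
profile_1 = {name: 40 for name in HIGH_SCALES}
profile_1.update({name: 22 for name in LOW_SCALES})
profile_1["carpenter"] = 12

profile_2 = dict(profile_1, lawyer=33)  # falls short on one 'high' scale

first = resembles_senior_administrator(profile_1)
second = resembles_senior_administrator(profile_2)
```

The check is all-or-nothing, as the wording of the rule implies; a man who misses a single threshold, as in the second profile, is not counted as resembling the senior group.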

Reference to Figure 1 will show the location of the four and ten oc- 
cupational interests on the interest globe. The four comprise Groups X 
and XI near the top of the globe and the ten comprise Group IV and 
half of Group VIII which are near the bottom of the globe. Further 
reference to the figure and Table 2 will disclose that to a high degree as 
one goes from occupational interests at the top of the globe to those at 
the bottom one goes from large plus to large minus differences in score 
between senior and junior administrators. This relationship is ex- 
pressed extremely well by scores on the OL (occupational interest level) 
scale. 

Differentiation by OL Scale. The last column in Table 2 gives the 
mean OL score of the criterion groups upon which the 34 occupational 
scales are based. It will be noted that the four scales upon which senior 
administrators score significantly higher than junior administrators have 
OL scores ranging from 63.0 to 64.4 (average of 63.7) and that the ten 
scales on which senior administrators score significantly lower than 
junior administrators have OL scores ranging from 48.5 to 59.5 (average 
of 54.0). If we correlate all the differences in occupational level interest 
scores between senior and junior administrators (column four of Table 
2) with the corresponding OL scores we obtain a rank coefficient of .84.
Evidently the OL scale measures the interests which differentiate senior
and junior administrators to a very considerable degree.

The OL interest scale contrasts the interests of business and profes- 
sional men, typifying the upper socio-economic level, with the interests 
of common laborers, typifying the lower socio-economic level. Mean 
scores on this scale for 34 occupations are given in Table 2. The hier- 
archy of occupations based on interests is quite similar to the hierarchy 
based on intelligence test scores.⁵

The mean OL scores of the four classes of public administrators are 
shown in Table 6, with corresponding occupational groups. 

5 E. K. Strong, Jr. Vocational interests of men and women. Stanford University
Press, 1943, Chapter 10.

The difference between the OL scores of Groups A and D, amounting
to 6.0, does not appear large but it represents 30% of the entire range
from the highest to the lowest socio-economic levels and has a critical
ratio of 4.8. The data indicate that senior administrators score as high
as any occupation so far tested and that junior administrators, as a group,
belong to a lower socio-economic level.
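A critical ratio is the difference between two means divided by the standard error of that difference. The sketch below assumes the usual formula for independent groups; the article does not state which formula was used. The figures are taken from the summary rows of Table 8, and reading the second of those rows as standard deviations is itself an assumption, since its label is illegible in the source.

```python
import math

def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by the standard error of the
    difference (independent groups; formula assumed, not given in the text)."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error

# (mean OL score, assumed standard deviation, number of cases) per class
classes = {
    "A": (64.9, 6.9, 34),
    "B": (61.9, 6.1, 112),
    "C": (60.1, 7.7, 231),
    "D": (58.9, 7.4, 141),
}

cr_ab = critical_ratio(*classes["A"], *classes["B"])
cr_cd = critical_ratio(*classes["C"], *classes["D"])
```

Under these assumptions the computed ratios come close to those reported in the footnote to Table 8 for most pairs of classes, though they do not reproduce every published value exactly.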

If it could be shown that OL scores increase with age then it would be 
easy to explain the fact that Class A averages higher than Class D as 
Class A is composed of older men than Class D. Unfortunately for such 
a hypothesis the data so far accumulated indicate practically no increase 
in OL score with age. Table 7 gives the mean OL scores of Classes A to 
D for ages ranging between 25 and 70 years. There is here no evidence 
of increase in OL score with age. 

Table 6
Mean OL Scores for Four Classes of Public Administrators

Public Administrators     Mean OL Score
Class A                       64.9
Class B                       61.9
Class C                       60.1
Class D                       58.9

[The third column, "Occupations with Comparable Score," is only partly
legible in the source. The occupations that can be read are lawyer, life
insurance, mathematician, personnel manager, chemist, purchasing agent,
accountant, minister and banker.]


Some of the men now in Classes C and D will eventually move into
Classes B and A. But as far as the interests measured by the OL scale
go it is evident that a considerable number in Classes C and D cannot
have scores as high as men in Classes A and B or there would not be the
differences in mean scores which are given above. The actual
distribution of OL scores for the four classes is given in Table 8.
Assuming that senior administrators should not score below 55 on this
scale we have: 5.9% of A administrators scoring below 55; 10.0% of B
administrators scoring below 55; 24.7% of C administrators scoring below
55; and 25.0% of D administrators scoring below 55. In terms of
overlapping between sub-group A and the other three sub-groups we have:
86.2% of B overlap with A; 73.8% of C overlap with A; and 67.1% of D
overlap with A. On the basis of OL interest scores we can roughly
estimate that between a fourth and a third of Group D do not have the
interests characteristic of senior public administrators. Unfortunately
we do not have similar comparisons in terms of ability to compare with
these calculations involving interests. Such a statement does not imply
that this minority are necessarily in the wrong type of work. Many Class
D activities are sufficiently different from Class A activities to
demand men of different interests and abilities for their successful
handling.

6 Ibid., Chapter 10.

Table 7
Distribution of OL Scores of Four Classes of Public Administrators According to Age
([?] marks an entry that is illegible in the source)

                                    Classes
               A           B           C           D         Total
Age         N   Mean    N   Mean    N   Mean    N   Mean    N   Mean
70                                  1   [?]                 1   [?]
65          1   [?]     3    62     8    65     4    56    16    62
60          2    69     9    62    14    60     6    55    31    61
55          4    66    19    64    26    61    16    57    65    61
50          7    62    18    64    46    59    22    58    93    60
45          7   [?]    26    62    44    59    20    60    97    61
40          9    66    22    62    40    60    27    62    98    61
35          2    63    11    60    17    58    25    59    55    59
30          2   [?]     1    62    23   [?]    17    58    43    58
25                      1    62     3    67     3    58     7    62
Total Cases    34         110         222         140         506
Mean           65          62          60          59          60
Cases with
  no age                    2           9           1          12


Differentiation on the Lawyer Scale. Mean scores of the four classes of
public administrators on the lawyer interest scale are as follows: A, 42.2;
B, 36.2; C, 34.3; and D, 33.7. The critical ratio of the difference
between senior and junior administrators is 4.4, slightly less than the
critical ratio of 4.8 on the OL scale. The data suggest, however, that the
lawyer scale differentiates Classes A and B better than the OL scale as
the corresponding critical ratios are 3.0 and 2.3 respectively, but the
reverse is the situation between Classes C and D where the respective
critical ratios are 0.6 and 1.5. Similar results are to be expected from
the lawyer and OL scales since they correlate .60.

It might be supposed that this differentiation between the four classes 
in terms of lawyer interest is due to the fact that there are more legally 
trained men in Class A than Class D. The facts, however, do not 
justify this hypothesis. Of the 333 men who have supplied sufficient 
information about their education to determine what field they specialized 
in there are only 46 men reporting legal training. (We do not here count 
one or two courses in the subject but rather a preparation that would 
lead to legal practice if the man so desired.) This amounts to 13.8%.
The percentage is 12.3 for men in Classes A and B and 14.9 in Classes C
and D. Consequently the increase in lawyer interest scores from Class D
to Class A is not attributable to an increase in legal training from
Class D to Class A.

Naturally the various functional sub-groups vary in mean lawyer
score and in percentage of members with legal training. There is,
however, some association between the two sets of data (correlation of .61).
It is difficult to say whether a functional group has a high mean lawyer
interest score because it has more than its share of men with legal
training or that the type of work performed attracts men with interests of
lawyers and so men with legal training are more apt to be found in that
work than in other activities.


Table 8
Distribution of OL Scores of Four Classes of Public Administrators.
Figures are Percentages

OL Score       A        B        C        D
80            [?]               [?]
75            [?]      1.8      [?]
70            [?]     10.9      [?]      8.6
65            [?]     20.9      [?]     13.6
60            [?]     35.4      [?]     24.3
55            [?]     20.9      [?]     28.6
50            [?]      9.1      [?]     17.9
45            [?]      0.9      [?]      4.3
40            [?]               [?]      1.4
35            [?]               [?]      0.7
30            [?]                0.4     0.7

Mean          64.9     61.9     60.1     58.9
S.D.           6.9      6.1      7.7      7.4
N             34      112      231      141

* Critical ratios of differences in mean scores are as follows: AB, 2.3; AC, 3.8; AD,
4.8; BC, 2.4; BD, 3.5; and CD, 1.5.
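
The critical ratios reported for the OL scale in Table 8 compare two class means. As a rough check, the sketch below applies the usual large-sample formula, the difference between two means divided by the standard error of that difference, to the summary rows of Table 8. It assumes those rows are means, standard deviations, and numbers of cases (the standard-deviation reading, and 7.4 for Class D, are assumptions), and the source does not state that this exact formula was used. The name `critical_ratio` is hypothetical.

```python
import math

# Summary figures read from Table 8 (OL scale). Reading the second row as
# standard deviations is an assumption, as is 7.4 for Class D.
SUMMARY = {
    "A": {"mean": 64.9, "sd": 6.9, "n": 34},
    "B": {"mean": 61.9, "sd": 6.1, "n": 112},
    "C": {"mean": 60.1, "sd": 7.7, "n": 231},
    "D": {"mean": 58.9, "sd": 7.4, "n": 141},
}


def critical_ratio(g1, g2):
    """Difference of two means divided by the standard error of the difference."""
    se = math.sqrt(g1["sd"] ** 2 / g1["n"] + g2["sd"] ** 2 / g2["n"])
    return (g1["mean"] - g2["mean"]) / se


for first, second in [("A", "B"), ("A", "C"), ("A", "D"),
                      ("B", "C"), ("B", "D"), ("C", "D")]:
    cr = critical_ratio(SUMMARY[first], SUMMARY[second])
    print(f"{first}{second}: {cr:.2f}")
```

With these rounded inputs the sketch gives about 2.3 for AB, 3.5 for BD, and 1.5 for CD, in line with the reported ratios; AC, AD, and BC come out somewhat lower than reported (about 3.7, 4.5, and 2.3), so the check is approximate at best.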

Differentiation on the President Scale. Mean scores of the four classes 
of public administrators on the president scale are as follows: A, 38.2; 
B, 34.1; C, 33.3; and D, 31.2. In terms of critical ratios, senior and
junior administrators are not differentiated as well as on the lawyer and
OL scales; the three critical ratios are, respectively, 4.0, 4.4 and 4.8.
Classes A and B are differentiated as well on the president scale as on the
OL scale (C.R. of 2.3) but not as well as on the lawyer scale (C.R. of 3.0).
The president scale, however, differentiates Classes C and D better than
the other two scales, with critical ratios, respectively, of 2.0, 0.6 and 1.5.


Summary and Conclusion


Over 500 public administrators have been classified on the basis of 
the managerial responsibility exercised by them. The four classes earn 
approximately $3,000 to $4,999; $5,000 to $6,999; $7,000 to $8,999 and 
$9,000 and above. The first and fourth classes are referred to here as 
junior and senior public administrators. 

The interests of senior and junior public administrators differ
somewhat, enough in fact to suggest that a fourth to a third of junior
administrators do not have the interests of senior administrators. Such
evidence as we have does not warrant the belief that the interests of the
junior administrators will change with increasing age in the direction of 
senior administrators. 

Senior public administrators and presidents of manufacturing con- 
cerns have interests more in common than has either with junior admini- 
strators. But senior administrators and presidents differ appreciably. 
The former have more than the latter of the interests of men engaged in 
social work and in influencing people but not in selling them, have more 
the interests of scientists, particularly of psychologists and engineers, 
and of public accountants and less of the interests of general office people 
including accountants, of production managers and of presidents. 

Senior administrators differ from junior administrators by having 
more the interests of presidents, of men engaged in selling and influencing 
people and of scientists, whereas junior administrators have more of the 
interests of social workers, production managers, general office people and 
skilled workmen. 

Senior and junior administrators differ significantly in their scores on 
fourteen occupational interest scales. It is possible that the two groups 
could be well differentiated by using some weighting system applied to 
scores on these scales. A general summary of such differences is meas- 
ured by the occupational level (OL) scale, on which the two groups are 
differentiated by a critical ratio of 4.8. They are also differentiated 
significantly by the lawyer and president scales, which correlate with OL, 
by .60 and .63, respectively. 

The two groups of administrators are not differentiated by scores on 
the public administrator scale, which is based on their records and on
those of the two intermediate classes of administrators. Such a scale 
measures the differences in interests between public administrators as a 
whole and men-in-general, representative of the upper socio-economic 
level. Degree of possession of managerial responsibility cannot be meas- 
ured to any appreciable degree by such an interest scale. 

Degree of success-failure is quite another measure from that of degree 
of possession of managerial responsibility. It cannot be determined
satisfactorily any more than the latter on an occupational interest scale.
If success-failure is to be measured in terms of interests it must be ac- 
complished by contrasting the interests of men who are successful with 
the interests of men who are not successful. Almost nothing has been 
done in this area so we do not know what are the possibilities. 

Extensive data regarding men in the Forest Service indicate that 
as one goes upward from district ranger to the top administrative posi- 
tions there is a progressive decrease in the interests typical of rangers 
and an equally progressive increase in the interests of administrators.⁷

Top administrators perform different work from that of lower officials 
and evidently possess somewhat different interests. Selection of men 
should be on a different basis for the top and bottom levels. As only a 
small percentage of men at the bottom ever reach the top it is necessary 
to select only a small percentage of men entering a profession who have 
the characteristics of men at the top. The remainder can be selected 
for the work they are to perform in the lower and middle levels. It seems
obvious that there should be some other way of rewarding competent
men who perform the middle level jobs successfully than by promoting
them into administrative work for which neither their interests nor 
abilities fit them. 

At the moment the OL scale seems to measure the differences in inter- 
ests of junior and senior administrators as well as any scale. Further re- 
search is needed to substantiate this statement. Efforts should be made 
to see if some revision of the present OL scale may not perform this serv- 
ice even better. Comparison should be made between scores on an in- 
telligence test and OL scores to see which are more useful here, or 
whether some combination of the two is better than either one alone in 
differentiating degrees of administrative ability. 

Received January 29, 1945. 


7 E. K. Strong, Jr. The interests of Forest Service men, Educational and Psycho- 
logical Measurement, 1945, 5, 157-171. 























A Comparison of the Thurstone and Likert Techniques of 
Attitude Scale Construction


This is a study in the methodology of attitude measurement; a
comparison and evaluation of two methods of attitude scale construction.
Although various techniques for the measurement of social attitudes have 
been suggested,¹ the two most frequently used methods are probably the
“method of equal appearing intervals” developed by Thurstone and
Chave (18) and the “method of summated ratings”² developed by Likert
(11). This study is concerned with the relative merits of the Thurstone 
and Likert techniques of scale construction. 


Method of Equal Appearing Intervals 


The method of equal appearing intervals begins with the collection of 
a variety of statements of opinion toward a particular issue which are then 
screened and edited in accordance with certain “informal criteria.”
Statements which appear to represent past rather than present attitudes
are discarded or re-worded, as are statements which appear to be
double-barreled³ or which contain confusing or ambiguous concepts. Inspection
should also exclude statements which might be approved by individuals
with opposed attitudes.⁴

1 Excellent summaries can be found in Albig (1, pp. 181-213), Bird (3, pp. 149-167),
LaPiere and Farnsworth (10, pp. 397-399), and Murphy, Murphy, and Newcomb
(13, pp. 891-912).

2 The term “method of summated ratings” was introduced by Bird (3, p. 159) to
describe the procedure followed by Likert.

3 But in this connection see the discussion by Edwards (4, p. 578) which points to
the possible value of statements which contain a “rationalization” clause.

4 A more detailed enumeration of the rules to be followed can be found in the
monograph by Thurstone and Chave (18, pp. 56-58).

After the statements have been edited, they are then presented to a
group of judges who are instructed to sort them into various categories
to represent a scale ranging from extremely favorable, through neutral,
to extremely unfavorable expressions of opinion about the issue or
institution in question. The judges are not asked to give their own opinions,
but merely to estimate the degree of favorableness or unfavorableness 
expressed by each statement. When the sorting procedure is completed, 
tabulations are made indicating the number of judges who placed each 
item in each category. From these data accumulative proportions are 
computed for each item and ogives are constructed. Scale values of the 
individual items are then read from the ogives, the value of each item 
being that point along the base line, in terms of scale value units, above 
and below which 50 per cent of the judges placed the item. 

A statistical criterion of the ambiguity of the items is provided in 
terms of the width of the range between the points on the scale marking 
off the 25th and 75th percentiles. This distance is called the Q value. 
A small Q value indicates that the middle 50 per cent of the judgments 
spread over a relatively small range or, in other words, that there is a 
good deal of agreement among the judges as to where the item belongs on 
the scale. A large Q value indicates lack of agreement among the judges 
and, indirectly, that something is probably wrong with the wording of the 
statement. 
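
The ogive readings described above amount to finding the 25th, 50th, and 75th percentiles of the judges' placements of an item. The following is a minimal sketch, assuming categories of unit width, with category k covering the interval from k-1 to k, and linear interpolation within a category. Both conventions are assumptions, as are the function names and the tallies, which are invented for illustration.

```python
def percentile_point(counts, p):
    """Scale point below which a proportion p of the judgments fall.

    counts[k] is the number of judges who placed the item in category k+1;
    category k+1 is assumed to span the interval from k to k+1 on the scale.
    """
    total = sum(counts)
    target = p * total
    cumulative = 0
    for k, c in enumerate(counts):
        if c > 0 and cumulative + c >= target:
            # Interpolate linearly inside the category containing the target.
            return k + (target - cumulative) / c
        cumulative += c
    return float(len(counts))


def scale_value_and_q(counts):
    """Return (median scale value, Q), Q being the 25th-to-75th percentile range."""
    q1 = percentile_point(counts, 0.25)
    median = percentile_point(counts, 0.50)
    q3 = percentile_point(counts, 0.75)
    return median, q3 - q1


# Hypothetical tallies: one statement sorted by 40 judges into 11 categories.
tallies = [0, 0, 2, 6, 12, 12, 6, 2, 0, 0, 0]
value, q = scale_value_and_q(tallies)
print(f"scale value = {value:.2f}, Q = {q:.2f}")
```

A statement on which the judges disagree would show the same kind of median but a wider spread of tallies, and therefore a larger Q.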

Items are selected for the final scale on the basis of the computed 
scale and Q values. An attempt is made to select about 20 or 22 items 
with low Q values and with scale values falling at relatively equally- 
spaced distances along the continuum. Two comparable forms of the 
scale, in terms of scale and Q values of the items included in each form, 
are constructed. The two forms are then given to a new group of
subjects who are asked to check those statements with which they agree.
The score for the individual subject is the mean or median scale value of 
the items which he has checked as being those with which he agrees. 
Reliability of the scales is found by correlating scores on the two forms 
of the scale. 
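
Scoring a subject on a finished form is then a matter of taking the mean or median scale value of the statements he has checked. A minimal sketch follows; the statement labels and scale values are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean, median

# Hypothetical scale values for the statements on one form of a scale.
SCALE_VALUES = {"s1": 1.2, "s2": 3.1, "s3": 5.4, "s4": 7.8, "s5": 9.6}


def thurstone_score(endorsed, use_median=True):
    """Mean or median scale value of the statements a subject checked."""
    values = [SCALE_VALUES[s] for s in endorsed]
    if not values:
        raise ValueError("subject endorsed no statements")
    return median(values) if use_median else mean(values)


print(thurstone_score(["s2", "s3", "s4"]))
print(thurstone_score(["s2", "s3", "s4"], use_median=False))
```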


Method of Summated Ratings 


The method of summated ratings also calls for a collection of various
statements of opinion which are then edited in accordance with informal
criteria similar to those used in the method of equal appearing intervals.⁵
After the elimination and editing of items failing to meet the prescribed
standards, the remaining statements are presented to a group of subjects
who are asked to respond to each one in terms of their own agreement or
disagreement with the statement. Usually a 1 to 5 scale of response is
used; subjects check whether they strongly agree, agree, are undecided,
disagree, or strongly disagree with each statement. A score is given for
each item depending upon the response made. The five possible
responses may be weighted 1-2-3-4-5 or 5-4-3-2-1.⁶ Either 1 or 5 is
consistently favorable or unfavorable, although the continuum is reversed in
about half the statements. That is, about half the statements are
worded so that a strongly agree response indicates a favorable reaction to
the issue in question, while the other half of the statements are worded so
that a strongly agree response indicates an unfavorable reaction. The
score for the individual subject is the sum of all scores for the separate
items.

5 See Murphy and Likert (12, pp. 281-283) and Rundquist and Sletto (16, pp. 6-8)
for a discussion of these standards.
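
The scoring rule just described can be sketched as follows. The sketch assumes that 5 marks the favorable end of every item (the text allows either direction), and the names `WEIGHTS`, `item_score`, and `summated_score`, like the sample answers, are hypothetical.

```python
# Weights for the five responses when agreement is the favorable direction.
WEIGHTS = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def item_score(response, reversed_item):
    """Score one response; unfavorably worded items flip the 1-to-5 weights."""
    weight = WEIGHTS[response]
    return 6 - weight if reversed_item else weight


def summated_score(responses, reversed_flags):
    """Sum of item scores, so that a high total always means a favorable attitude."""
    return sum(item_score(r, rev) for r, rev in zip(responses, reversed_flags))


# Hypothetical four-item scale; items 2 and 4 are worded unfavorably.
answers = ["agree", "strongly disagree", "undecided", "disagree"]
flags = [False, True, False, True]
print(summated_score(answers, flags))
```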

In selecting items for the final scale, a criterion of internal consistency 
is used. Criterion groups consisting of the upper and lower 10 (or some 
other) per cent of the subjects in terms of total scores are compared to find 
whether the individual items will differentiate between the two groups. 
The means of the upper and lower groups for each item are found; items 
which show the largest difference between the means of the two groups 
are retained in the final scale. 
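
The internal-consistency check can be illustrated with a short sketch. It assumes criterion groups formed by ranking subjects on total score and taking a fixed fraction from each end; the function name and all the data are hypothetical.

```python
def discriminating_power(item_scores, total_scores, fraction=0.10):
    """Item mean of the highest-scoring group minus that of the lowest-scoring group.

    item_scores[i] and total_scores[i] belong to the same subject.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))          # size of each criterion group
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    mean_low = sum(item_scores[i] for i in low) / k
    mean_high = sum(item_scores[i] for i in high) / k
    return mean_high - mean_low


# Hypothetical data for ten subjects.
totals = [20, 35, 41, 48, 52, 57, 63, 70, 78, 90]
item_a = [1, 2, 2, 3, 3, 4, 4, 4, 5, 5]   # rises with the total score
item_b = [3, 4, 2, 3, 3, 2, 4, 3, 3, 3]   # unrelated to the total score
print(discriminating_power(item_a, totals, fraction=0.2))
print(discriminating_power(item_b, totals, fraction=0.2))
```

In this made-up example the first item separates the two criterion groups and would be retained, while the second does not.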

Scales constructed by the method of summated ratings usually con- 
tain about 20 to 25 items, although Hall (8) has used scales with as few 
as 5, 7, and 10 items. Reliability of the scales is found by the split-half 
method of correlating scores for the odd versus even items. 
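
A minimal sketch of the odd-even procedure follows. The step from the half-test correlation to the full-length coefficient uses the Spearman-Brown formula, which is assumed here as the customary correction and is not named in the text. The function names and the item scores are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(score_matrix):
    """Odd-even correlation and its Spearman-Brown stepped-up value.

    score_matrix[s][i] is subject s's score on item i, items in scale order.
    """
    odd = [sum(row[0::2]) for row in score_matrix]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in score_matrix]   # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical item scores (1 to 5) for five subjects on a four-item scale.
scores = [
    [5, 4, 5, 5],
    [4, 4, 3, 4],
    [3, 2, 3, 3],
    [2, 3, 2, 1],
    [1, 1, 2, 1],
]
r_half, r_full = split_half_reliability(scores)
print(f"half-test r = {r_half:.3f}, stepped-up r = {r_full:.3f}")
```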


Criticisms of Thurstone’s Method 


The monograph by Thurstone and Chave, describing in detail the 
method of equal appearing intervals for measuring attitudes, appeared in 
1929. By the time Likert’s monograph describing his technique appeared 
in 1932, the Thurstone procedure was generally recognized as a major, if 
not the most important, development in the field of attitude scale con- 
struction. It is important, therefore, if we are to compare the two 
methods, to examine the motivation behind Likert’s departure from the 
by then already well-established Thurstone technique. Some indication 
of this is given by the following quotation from Murphy and Likert⁷:

“A number of statistical assumptions are made in the application of
his (Thurstone’s) attitude scales—e.g., that the scale values of the
statements are independent of the attitude distribution of the readers who
sort the statements—assumptions which, as Thurstone points out, cannot
always be verified. The method is, moreover, exceedingly laborious.
It seems legitimate to inquire whether it actually does its work better
than the simpler scales which may be employed, and in the same breath
to ask also whether it is not possible to construct equally reliable scales
without making unnecessary statistical assumptions” (12, p. 26).⁸

6 This is a simplified method of scoring which was found to correlate .99 with the
more complicated sigma method first used.

7 We quote from Murphy and Likert rather than from Likert’s original report
because the Murphy and Likert publication is probably more readily available to the
interested reader and since it contains, with but few corrections or omissions, the
material originally reported by Likert and, in addition, a more detailed report of applications
of scales constructed by the technique. The passage quoted, with but minor changes,
is the same as that appearing in Likert (11, p. 6).

The main contentions of Murphy and Likert regarding the method of 
summated ratings seem essentially to be: (1) “it avoids the difficulties 
encountered when using a judging group to construct the scale” (12, p. 42); 
(2) “the construction of an attitude scale by the sigma method⁹ is much
easier than by using a judging group to place the statements in piles from
which the scale values must be calculated” (12, p. 43); (3) “it yields
reliabilities as high as those obtained by other techniques with fewer
items” (12, pp. 42-43); (4) it gives results which are comparable to those
obtained by the Thurstone technique.¹⁰ More generally, the method of
summated ratings “seems to avoid many of the shortcomings of existing
methods of attitude measurement, but at the same time retains most of
the advantages present in methods now used” (12, p. 42). These claims,
it should be noted, have been vigorously contested, notably by Bird (3)
and Ferguson (7). For our part, we shall, in the sections which follow, 
attempt to evaluate them in the light of available evidence. 


Influence of the Judging Group 


Several studies cast doubt upon Murphy and Likert’s criticism that 
the attitudes of the judging group may influence the scale values of items 
when the method of equal appearing intervals is used. Using various 
approaches to the problem, these studies (5, 9, 15) seem to be in
agreement that the attitude of the judging group is not a seriously dis- 
turbing factor. Hinckley’s study (9), in particular, is clear cut. Groups 
of white students with differing attitudes toward the Negro were asked 
to sort items expressing opinions about the Negro. A high positive cor- 
relation was obtained between the scale values assigned to the items by 
the white students who were favorable and by those who were unfavorable 
in attitude toward the Negro. A high positive correlation was also ob- 
tained between scale values derived from judgments of an antagonistic 
white group and from the judgments of a group of Negroes. 


8 The method of summated ratings also makes certain statistical assumptions as
Murphy and Likert recognize (12, pp. 26 ff.). Cf. also the paper by Ferguson (6).

9 Later replaced by the even simpler 1 to 5 method.

10 We have failed to find a specific statement to this effect, yet the idea seems implied
in the paragraph quoted above.


Simplicity of the Likert Method


Investigators who have used the Likert method seem to be in agree- 
ment that it is simpler than the method of equal appearing intervals. 
Hall reports that he used the method of summated ratings in his survey 
of the attitudes of employed and unemployed men “because of its relative 
simplicity and because it yields scales of high reliability” (8, p. 6), and 
Rundquist and Sletto, who used the Likert technique in constructing the 
Minnesota Survey of Opinions scales, agree that “it is less laborious than 
that developed by Thurstone” (16, p. 5). This evidence is, of course, in 
the nature of authoritative opinion, and Bird has raised some rather 
pertinent objections concerning it. 

“Will the experimenter spend more time, too, in scoring every item 
and summating them in these long scales than another might spend 
determining the mean or median value by the Thurstone technique? 
Then too, is it actually less time-consuming to validate items in terms of 
selected groups than to determine the Q values from a curve or a distri- 
bution of scores? The claim of greater or lesser laboriousness seems to 
have been put forward without due regard for all processes in scaling 
techniques; but, in the interest of constructing refined measuring instru- 
ments, time can be neglected. There is much to be said in favor of a 
psychologist’s refining his instrument before actually applying it to ex- 
perimental groups. The argument that the method of summated ratings 
is less laborious limps badly” (3, p. 161).

Bird’s points are well taken, particularly in the case of scales con- 
structed by the method of summated ratings which contain more than, 
let us say, 25 items. But most investigators who have used the method 
of summated ratings have not found any need for the “long scales” to 
which Bird is objecting. After our own experience in constructing both 
Likert and Thurstone scales, we are inclined to agree with other investi- 
gators that scales can be constructed by the method of summated ratings 
more quickly and with less labor than by the equal appearing interval 
method. We found, for example, that construction of the Thurstone 
scales required about twice as much time, exclusive of the time spent by 
the judging group in sorting the items, as did the Likert scale. It is
unfortunate that this is but an estimate and that our records do not permit
a more precise statement of the time factor, a point that should be
checked in future research and reported.


Reliabilities of the Two Methods 


A note of confusion has centered around the subject of reliability 
largely as a result of Likert’s study of the reliability of a Thurstone-type 
scale which was scored by both his and the Thurstone technique. The
scale was given to a group of subjects with instructions to check the items
in accordance with the usual Thurstone instructions. The same scale 
was then given to the subjects with instructions to check for each item 
one of the five alternatives (strongly agree, agree, undecided, disagree, 
strongly disagree) in accordance with the usual Likert instructions. Four 
of the items on the Thurstone scale were not adaptable to Likert-type 
responses and were omitted when the subjects were asked to check their 
reactions according to the method of summated ratings scoring system. 

The reliability coefficient between the two forms of the scale (22 
versus 22 items), when scored by the Thurstone method, was .88 (cor- 
rected). The reliability coefficient for the two forms (18 versus 18 items) 
as scored by the Likert method was .94 (corrected). What this demon- 
strates, of course, is that it is possible to take a scale constructed by the 
Thurstone technique and to apply to most of the items the Likert method 
of scoring. But one critic seems to think that because of this finding, 
Likert erroneously concluded that “his technique is the better one” (7, 
p. 52). The higher reliability coefficient obtained by the Likert method 
of scoring, he adds, may be due to the fact that “increasing the number of 
steps in a psychological scale increases reliability” (7, p. 52). As a
matter of record, this is precisely the same explanation offered originally 
by Murphy and Likert (12, p. 55 and p. 47) for the higher reliability co- 
efficient obtained by the 1 to 5 method of scoring. The entire discussion, 
pro and con, on this point, it seems to us, has little bearing upon the 
question of whether the method of summated ratings or the method of 
equal appearing intervals will yield scales of higher reliability. The real 
problem concerns the reliabilities of scales constructed by the two meth- 
ods, not the reliability of a particular scoring scheme isolated from the 
technique of scale construction of which it is a part. And on this question
there is ample evidence. 

Ferguson has quoted Thurstone as reporting the reliabilities of scales 
constructed by the method of equal appearing intervals, under his
editorship, as being “all over .8, most of them being over .9” (6, p. 670). We
do not know whether these coefficients are for scales of 20 or 40 items, but 
Ferguson mentions that in his own studies he has found reliabilities for 
Thurstone scales ranging from “.52 to .80 for the 20-item forms and from
.68 to .89 for the 40-item forms” (6, p. 670). If we take these coefficients 
as representative, how do they compare with those reported for scales 
constructed by the method of summated ratings? 
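
Ferguson's two ranges are consistent with one another under the Spearman-Brown formula for a test of doubled length. The source does not say that the 40-item figures were obtained in this way; the worked step below is only a check under that assumption, with $r_{20}$ and $r_{40}$ as labels introduced here for the 20-item and 40-item reliabilities.

```latex
r_{40} = \frac{2\,r_{20}}{1 + r_{20}}, \qquad
\frac{2(.52)}{1 + .52} = \frac{1.04}{1.52} \approx .68, \qquad
\frac{2(.80)}{1 + .80} = \frac{1.60}{1.80} \approx .89
```

Both ends of the 20-item range thus step up to the corresponding ends of the 40-item range.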

Murphy and Likert found reliability coefficients for their
Internationalism Scale of 24 items ranging from .81 to .90.¹¹ Their Imperialism
Scale of 12 items gave coefficients ranging from .80 to .92; the Negro
Scale of 14 items yielded coefficients ranging from .79 to .91 (12, p. 48).
Rundquist and Sletto report coefficients ranging from .78 to .88 for various
scales of 22 items each (16, p. 110).

11 The reliability coefficients of the Likert scales are based upon split-half
correlations, and all of those reported here have been “corrected” to indicate the reliability
of the test taken as a whole. We feel that, at least for purposes of comparison, it is
not valid to raise the coefficients for the Thurstone scales which are based upon
equivalent forms of 20 to 22 items each. To do so would indicate the reliability to be expected
from a Thurstone scale of 40 to 44 items, while in practice the scales generally used
contain only half these numbers.

That Likert-type scales with even fewer items will give high reliability
coefficients is indicated by Hall. Reliability coefficients for his religious
scale of 10 items ranged from .91 to .93; for the scale of 7 items measuring
attitude toward employers the coefficients ranged from .77 to .87; and the
morale scale of 5 items gave coefficients of .69 to .84 (8, p. 19). All of
these coefficients compare favorably with those obtained from scales
constructed by the method of equal appearing intervals. According to the
evidence at hand, there is no longer any reason to doubt that scales
constructed by the method of summated ratings and containing fewer items
will yield reliability coefficients as high as or higher than those obtained
with scales constructed by the Thurstone method.


The Need for a Judging Group


The confusion which followed Likert’s re-scoring of a Thurstone-type
scale by the 1 to 5 method, unfortunately, has not been confined to the
subject of reliability; it has spread to involve the question of whether or
not there is need for a judging group in the construction of attitude scales.
Ferguson seems to believe that Murphy and Likert implied, as a result of
obtaining a higher reliability coefficient with the 1 to 5 method of scoring
than with the customary Thurstone method of scoring, that they had
demonstrated that the method of summated ratings does away entirely
with the need for a judging group. He argues against this and bases his
criticism on the following grounds: “Since the statements (used by
Murphy and Likert in the above study) had been sifted through the
sorting procedure (Thurstone’s), it would seem unjustifiable to conclude
that Likert’s method did away with the need for a judging group. To
test this point adequately one should compare scales constructed
(independently of the Thurstone method) by the Likert technique with those
constructed by the equal appearing interval method” (7, p. 52).

We are in complete agreement with this argument; therefore we find
ourselves at a loss to understand the following statement of experimental
design appearing in the same article:

“A more adequate test can be provided by rescaling items using
Thurstone’s method in scales constructed by Likert’s technique. If
Likert’s technique does away with the need for a judging group, the two
methods of treating the statements should give the same result” (7,
p. 52).

But this particular experimental design will not give a test of the two 
methods of scale construction; it is an investigation of where Likert- 
selected items will fall along the continuum posited by Thurstone or, 
stated somewhat differently, what Thurstone scale values will be attached 
to the particular items included in a particular Likert-type scale. 

What Ferguson found by following this line of investigation was that
Likert-selected items, when scaled according to the method of equal ap- 
pearing intervals, failed to spread evenly over the scale continuum of 
Thurstone; the statements failed to represent all degrees of attitude but 
fell largely at the favorable and unfavorable ends of the scale with the 
middle categories neglected. Only one of the Likert-type scales which 
Ferguson attempted to scale by the Thurstone technique, an economic 
conservatism scale, gave a fairly even spread of items, and the correlation 
between the Thurstone and Likert methods of scoring this scale was 
.70.¹² Because of these findings, the failure of the Likert-selected items
to spread evenly over the Thurstone continuum and the “low” correlation
between the Thurstone and Likert methods of scoring the one scale that 
did, Ferguson believes that he has successfully demonstrated “that 
Likert’s technique for the construction of attitude scales does not obviate 
the need for a judging group” (7, p. 57). 

We cannot agree with this conclusion. What has been demonstrated, 
as we pointed out earlier, is that Likert-selected items do not necessarily 
fall at equally-spaced intervals along the theoretical continuum posited 
by Thurstone and Chave. That they do not may be of theoretical inter- 
est, but has little bearing upon the practical problem of whether or not 
there is need for a judging group. This question can only be answered 
in terms of whether or not scales constructed independently by each of the 
two methods will yield comparable scores, i.e., if an individual is a 
standard distance above the mean on one scale, he will be a comparable 
distance above the mean on the second. 

It might seem that the correlation of .70 between the Thurstone and 
Likert method of scoring the economic conservatism scale would bear 
upon the problem. But this correlation is biased in that Ferguson failed 
to give the Thurstone method a fair trial, i.e., he limited the Thurstone 
scale to the items already selected by Likert’s technique. Nor can we 
accept the correlations of .75 and .81 (corrected for attenuation) which
Murphy and Likert report for their Internationalism Scale and scores on
the Thurstone-Droba War Scale. These correlations also fail to do
justice to the question of whether comparable results can be obtained with
independently constructed Thurstone and Likert scales since it is possible
that the attitudes under consideration are not the same.

12 Assuming that the reliability coefficient of this scale when scored by the Thurstone
method is approximately that obtained when the scale is scored by the Likert method
(reported by Rundquist and Sletto as .85), and correcting for attenuation, the
correlation would be .82.
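
The corrected figure of .82 given in the footnote on the economic conservatism scale follows from the standard correction for attenuation when both reliabilities are taken as .85. The symbols below are labels introduced here, not the authors' notation: $r_{TL}$ is the observed correlation between the Thurstone and Likert scorings, and $r_{TT}$ and $r_{LL}$ are the two reliabilities.

```latex
r_{c} = \frac{r_{TL}}{\sqrt{r_{TT}\, r_{LL}}}
      = \frac{.70}{\sqrt{(.85)(.85)}}
      = \frac{.70}{.85} \approx .82
```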


Comparative Study of the Two Methods 


A valid comparison of the Thurstone and Likert techniques, we
believe, must start with an original set of items, not with items already
sifted by the Thurstone procedure and then scored by Likert’s method, 
and not with items sifted by the Likert procedure and then scaled by the 
Thurstone technique. We believe also that the same group of subjects 
should be used in the construction of the two scales, but that the steps 
for each method should be carried through independently. To carry out 
this comparison, we used the original statements of opinion used by 
Thurstone and Chave in the construction of their scale designed to meas- 
ure attitude toward the church. 

Subjects used in the construction of the scale were 72 members of an 
introductory psychology class at the University of Maryland.¹³ Half the
class, selected at random, was asked to judge the degree of favorableness 
or unfavorableness expressed by the statements in accordance with the 
Thurstone method, while the other half of the class was requested to give 
Likert-type responses to the same statements. Two days later the pro- 
cedure was reversed; the first half of the class gave Likert-type responses 
to the statements, while the other half gave Thurstone-type responses. 
The Seashore and Hevner (17) method of rating items was used instead 
of the Thurstone and Chave procedure of sorting items into piles.¹⁴

In constructing the Thurstone scale, tabulations were made indicating 
the number of judges who placed each item in each of the categories. 
From these data accumulative proportions were determined and ogives 
constructed for each item. Scale values of the items were found by 
dropping a perpendicular to the baseline of scale values at the point where 
the curve crossed the 50 per cent level.¹⁵ Q values were determined in a


13 Although Thurstone and Chave used a much larger group of subjects, subsequent
research (14) indicates that groups as small as 25 or 50 can be used to obtain scale
values of items and that these values are very similar to those obtained with larger
groups.

14 Seashore and Hevner found that a technique of asking judges to rate statements
on a scale instead of requesting that they sort the statements into piles yielded results
which correlated very well with those obtained by Thurstone’s original sorting procedure.
See also the study by Ballin and Farnsworth (2).

15 The correlation coefficient between our scale values and those obtained by
Thurstone and Chave 15 years earlier was .95. Our Q values, however, tended to differ
considerably, the correlation coefficient being only .18.
similar fashion by dropping perpendiculars at the 25th and 75th per cent
levels, the Q value being the scale distance between these two points. 
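
The graphical reading of scale values and Q values described above can be approximated numerically. The following is only a sketch, assuming that rating category j occupies the interval from j - 0.5 to j + 0.5 and that the ogive is a straight line within each category; the function names and the judgment counts are hypothetical and are not data from the study.

```python
def ogive_point(counts, proportion, first_category=1):
    """Scale position at which the cumulative proportion of judges reaches `proportion`.

    counts[k] is the number of judges who placed the item in category
    first_category + k. Linear interpolation within a category is an
    assumption standing in for reading the point off a drawn ogive.
    """
    n = sum(counts)
    lower = first_category - 0.5      # lower boundary of the current category
    cum_before = 0.0                  # cumulative proportion below that boundary
    for count in counts:
        cum_after = cum_before + count / n
        if count > 0 and cum_after >= proportion:
            return lower + (proportion - cum_before) / (cum_after - cum_before)
        cum_before = cum_after
        lower += 1.0
    return lower                      # proportion never reached (rounding): top boundary


def scale_and_q(counts, first_category=1):
    """Return (scale value, Q): the 50 per cent point and the 25-to-75 per cent distance."""
    scale_value = ogive_point(counts, 0.50, first_category)
    q_value = (ogive_point(counts, 0.75, first_category)
               - ogive_point(counts, 0.25, first_category))
    return scale_value, q_value


# Hypothetical judgments of one item by 10 judges on a five-category rating scale.
example_counts = [1, 2, 4, 2, 1]
scale_value, q_value = scale_and_q(example_counts)
print(round(scale_value, 2), round(q_value, 2))
```

A small Q indicates that the judges agreed closely on where the item belongs, which is why low Q values were preferred in selecting items.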

Items were selected for two “equivalent” forms of the scale, each form
containing 20 items. Selection was made on the basis of Thurstone’s 
informal standards, Q values, and scale values. Insofar as possible, the 
final scales contained items with low Q values and with scale values which 
were spread along the entire scale at relatively equally-spaced distances. 
Since only a few items, however, were found to have scale values near the 
center of the continuum and, at the same time, low Q values, this was not 
entirely possible. 

In constructing the Likert scale, a total score for each subject was 
found by summing the weights of responses given for each of the items. 
The upper and lower 10 subjects in terms of total scores served as the 
groups for applying the criterion of internal consistency. Since many of 
the original items would not meet Likert’s a priori screening standards, 
we thought it possible that total scores determined in part by these items 
would include in the criterion groups individuals who might otherwise not 
be represented, i.e., if these items had not been used in the scoring. 
Thurstone and Chave, we might emphasize, had included various ambigu- 
ous items in the original set in order to test Q as a means of statistically 
determining ambiguity. Total scores, therefore, were first determined 
by excluding those items which we felt did not meet Likert’s criteria. 
Total scores were then found with these items included. Since we found 
that the criterion groups would contain essentially the same subjects using 
either score, the total scores based on all of the items were used. 

Twenty-five items, all with a mean difference between the two cri- 
terion groups of 1.8 or higher, were selected for the final Likert scale. 
Approximately half of these items were weighted 5 for a strongly agree 
response and half were weighted 1 for a strongly agree response. Of the 
25 items selected for the Likert scale, 3 were also used in Form A of the 
Thurstone scale and 2 were used in common with Form B. 
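
The criterion of internal consistency described above can be sketched as follows. This is an illustration only: it assumes that responses have already been weighted 1 to 5 with unfavorable items reverse-weighted, the function name is hypothetical, and the six subjects and three items are invented (the study used criterion groups of 10 subjects and a cutoff of 1.8).

```python
def criterion_differences(weighted_responses, group_size):
    """Mean item score of the highest-scoring group minus that of the lowest-scoring group.

    weighted_responses[s][i] is the weight (1 to 5) already assigned to
    subject s's response on item i.
    """
    totals = [sum(row) for row in weighted_responses]
    order = sorted(range(len(totals)), key=lambda s: totals[s])
    low_group, high_group = order[:group_size], order[-group_size:]
    n_items = len(weighted_responses[0])
    differences = []
    for i in range(n_items):
        high_mean = sum(weighted_responses[s][i] for s in high_group) / group_size
        low_mean = sum(weighted_responses[s][i] for s in low_group) / group_size
        differences.append(high_mean - low_mean)
    return differences


# Invented responses of six subjects to three items.
responses = [
    [5, 4, 3],
    [5, 5, 3],
    [4, 4, 2],
    [2, 3, 3],
    [1, 2, 3],
    [1, 1, 4],
]
differences = criterion_differences(responses, group_size=2)
selected = [i for i, d in enumerate(differences) if d >= 1.8]
print(differences, selected)
```

In this invented example the third item fails to separate the two groups and would be dropped.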


Reliability and Comparability 


To obtain data on the reliabilities of the scales and to find out the 
relationship existing between scores on the independently constructed 
Likert and Thurstone scales, members of another introductory psychology 
class and an applied psychology class at the University of Maryland were 
tested. One group of subjects was presented with the Thurstone scales 
followed by the Likert scale; for the second group of subjects the order of 
presentation was reversed. There were 80 subjects altogether, each 
group containing approximately half of this number. 

The reliability coefficient for the Likert scale of 25 items was .94. 
This coefficient compares favorably with those usually reported for scales 
constructed by this method. The reliability coefficient for the equivalent
forms of the Thurstone scales of 20 items each was .88. This is compar- 
able to the reliability coefficients of .85 and .89 which Thurstone and 
Chave originally reported for scores on their Form A and B for two differ- 
ent groups of subjects (18, p. 66). 

The correlation coefficient between scores on the Likert scale and 
Form A of the Thurstone was .72, which, when corrected for attenuation, 
becomes .79. On the other hand, the correlation between the Likert 
scale and Form B of the Thurstone was .92. When corrected for attenu- 
ation the coefficient indicates a perfect relationship. Unfortunately, we 
have no way of knowing which of these two coefficients is more representa- 
tive of the “true” relationship existing between scores on independently 
constructed Likert and Thurstone scales in general. But the coefficient 
of .92 between the Likert and one of the Thurstone scales is surely suffici- 
ently high to establish the fact that it is possible to construct scales by the 
two methods which will yield comparable scores. This is the question 
we set out to answer. 
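
The corrected coefficients can be checked with the usual correction for attenuation, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes that the reliabilities used were the .94 and .88 reported above; the function name is ours.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Observed correlation divided by the geometric mean of the two reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


likert_reliability = 0.94      # reported for the 25-item Likert scale
thurstone_reliability = 0.88   # reported for the equivalent Thurstone forms

form_a = correct_for_attenuation(0.72, likert_reliability, thurstone_reliability)
form_b = correct_for_attenuation(0.92, likert_reliability, thurstone_reliability)
print(round(form_a, 2), round(form_b, 2))
```

Under these assumptions the Form A value agrees with the .79 given in the text, and the Form B value slightly exceeds 1.00, which is read as a perfect relationship.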


Summary and Conclusions 


Now if we go back and examine the points on which we compared the 
Thurstone and Likert techniques of scale construction, we reach the fol- 
lowing conclusions: 

1. The evidence available indicates that the attitude of the judging 
group is not an important factor determining the scale values of items 
sorted by the Thurstone technique.¹⁶

2. Scales constructed by the Likert method will yield higher reliability 
coefficients with fewer items than scales constructed by the Thurstone 
method. 


3. What evidence we do have seems to indicate that the Likert tech- 
nique is less time-consuming and less laborious than the Thurstone 


16 We are not satisfied with the evidence on this point. Would similar results obtain
from judgments derived from those with sympathetic attitudes toward fascism and 
those violently opposed to fascism in the construction of a scale measuring attitude 
toward fascism? And in the case of communist sympathizers and non-communists in 
the construction of a scale measuring attitude toward communism? When social 
approval or disapproval attaches to a favorable or unfavorable attitude toward an issue, 
different scale values might result from groups with differing attitudes. An individual 
with a highly generalized unfavorable attitude toward fascism, for example, might 
scale an item such as: “Superior races are justified in dominating inferior races by
force” as very favorable toward fascism. But would “native fascists” tend to scale it
toward the same end of the continuum? The research so far, it seems to us, also neglects 
the related problem of ego-involved attitudes and the bearing they might have upon 
scale values of items. 
technique. But additional research is needed on this point and should be
based on carefully kept time records. 

4. It is true that Likert-selected items tend to be those which would 
fall at one or the other extreme on the Thurstone continuum, if scaled ac- 
cording to the Thurstone technique. But the implication of this finding 
is more theoretical than practical as far as the need for a judging group is 
concerned. The important problem is whether scores obtained from the 
two differently constructed scales are comparable and the evidence at 
hand indicates that they are. As far as we can determine there is nothing 
of a practical nature to indicate that a judging group, in the Thurstone 
sense, is a prerequisite for the construction of an adequate attitude scale. 
Received December 18, 1944. 
The Graphic Item Counter, an attachment for the IBM test scoring
machine, has come into rather extensive use during recent years. This 
device is designed to provide item analysis data of objective test questions 
by mechanical means. By means of the attachment, it is possible to 
print in graphic form the responses to as many as 90 questions, for a maxi- 
mum of 115 answer sheets. This attachment will also provide the neces- 
sary data for questionnaire analysis, or any type of response-counting, 
provided the original records are in the form of marks in particular posi- 
tions on machine-scoreable answer sheets. 

The item counter is equipped with a plugboard which has one position 
corresponding to each of the 750 possible response positions on the answer 
sheet. One end of each plugwire is connected to the response position to 
be counted, and the other end is plugged to one of the 90 counters in 
which it is desired to record the summation of marks in the particular 
answer space. Total counters are also provided which record the total 
number of sheets run through the machine. A switch is provided by 
means of which it is possible automatically to stop the machine at the 
end of a run of 100 papers. The plugboard may be wired to count the 
number of correct responses to 90 multiple-choice items at one time, or to 
record the number of responses to each of the choices of five-choice ques- 
tions for 18 items at one run. 
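
The capacity figures above follow from dividing the 90 counters among the response positions to be counted. The sketch below assumes one counter per choice and ignores any practical wiring constraints; the function names and the 40-item example are illustrative.

```python
import math

COUNTERS = 90  # counters available on the attachment, as stated in the text


def items_per_run(choices_per_item, counters=COUNTERS):
    """Items whose every choice can be counted in one run (one counter per choice)."""
    return counters // choices_per_item


def runs_needed(n_items, choices_per_item, counters=COUNTERS):
    """Passes of the answer sheets needed to count every choice of every item."""
    return math.ceil(n_items / items_per_run(choices_per_item, counters))


print(items_per_run(5))      # five-choice items countable in one run
print(runs_needed(40, 5))    # runs for a 40-item, five-choice test
print(runs_needed(40, 4))    # runs if the same test had four-choice items
```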

To analyze the marks on a group of answer sheets, each of the answer 
sheets is passed through the machine; the motor key is depressed once to 
put the answer sheet in position, and then the counter scans the sheet and 
makes the count of marks. As soon as this scanning is completed, the 
motor key is depressed again, and the sheet is automatically released and 
ejected from the machine. When the last answer sheet has been passed 
through the machine, a blank graphic item count record sheet and a piece 
of carbon paper are inserted in the machine, in much the same way as a 


* The opinions or assertions contained herein are the private ones of the writers 
and are not to be construed as official or reflecting the views of the Navy Department 
or the naval service at large. 
sheet of paper is inserted in a typewriter. By operating the print-start
lever, the carriage automatically rolls the record sheet through the ma- 
chine and prints on it a bar graph of the item count. The height of the 
bars for each item indicates the number of responses recorded for that 
item. It is then a simple matter to transfer the number of responses for 
each item, by means of a scale printed on each side of the record form, to 
individual item cards or other permanent record forms. 

It is the purpose of this paper to present quantitative data relating to 
both the time and accuracy of the machine operations when compared 
with those done by hand. Since the clerical personnel is common to the 
tabulation of data derived from both machine and hand operations, this 
study is, basically, a comparison of the speed and accuracy of persons 
using these two methods of obtaining item analysis data. 

The present inquiry was carried out at the Central Examining Board 
of the Naval Air Training Command. The Board prepares uniform tests 
which are administered at several naval training activities. At these 
training activities, students are given periodic objective tests in a number 
of subject matter fields—e.g., Principles of Flying, Aircraft Engines, 
Aerology, etc.—the results of which are analyzed by the Board and re- 
ported upon to the cognizant authorities. 

This study is based upon analyses of answer sheets used exactly as
they were received from the training activities, i.e., without any re-marking
of the sheets. In any situation involving many instructors administering
tests to large groups, it is to be expected that not all of the students
will mark all papers perfectly. Students sometimes record responses 
only to later erase them poorly before substituting other marks. Other 
examinees, despite instructions to the contrary, insist upon using hard 
lead pencils, or else mark too lightly. Machine item analysis is more
precise if clerks first scan the answer sheets and re-mark noticeably
inadequate marks before running them through the machine. In this
particular study, however, papers were not re-marked, a fact which should
be taken into consideration when viewing the resulting data.

To obtain a measure of the accuracy of the Graphic Item Counter 
itself, it would be necessary to use groups of perfectly marked papers. 
The present study, however, does not deal with the mechanical accuracy 
of the machine itself, but with the analysis of results obtained from run-of- 
the-mill answer sheets. 

In the present study, groups of 100 test papers (answer sheets) were 
used for each test included, since this gives percentage figures without 
conversion. All tests involved in the study were made up of five-choice 
questions. Since an accurate count of the number responding to each 
choice (students being requested to respond to all questions) should total 
100 cases for each question, the extent to which the tabulations for each
question equal 100 is an index of accuracy. The first check consisted of 
seeing to what extent the tabulations for each item approximate 100. 
Table 1 shows the results for tabulations from the machine record for 320 
test items in four different Physics tests, 710 test items from six different 
tests in Principles of Flying, and 640 test items from seven tests in 
Operation of Aircraft Engines. The table gives the percentage of items 
totalling 100, as well as the percentages above and below it. For each of 
the test items there were 100 answer sheets (or examinees). 

As seen in Table 1, 34.4 per cent of the items showed a total, for the 
choice counts, of 100. Within the range of plus-or-minus one per cent 


Table 1
Totals of the Per Cent Figures for the Five Choices

                Four Physics   Six Prin. of  Seven Engines    All Tests
                   Tests       Flying Tests      Tests        Combined
Percentage        No.            No.            No.            No.
Totals          Items      %   Items      %   Items      %   Items      %

96 and below       12    3.7      43    6.0      18    2.9      73    4.4
97                 13    4.1      56    7.9      48    7.5     117    7.0
98                 54   16.9     120   16.9      96   15.0     270   16.2
99                 92   28.8     208   29.3     147   22.9     447   26.8
100               104   32.5     209   29.4     262   40.9     575   34.4
101                37   11.6      42    5.9      48    7.5     127    7.6
102                 4    1.2       9    1.3      15    2.3      28    1.7
103                 2    0.6       2    0.3       5    0.8       9    0.5
104 and above       2    0.6      21    3.0       1    0.2      24    1.4
Totals            320  100.0     710  100.0     640  100.0    1670  100.0





were found 68.8 per cent of the items. A range of plus-or-minus two per 
cent includes 86.7 per cent of the cases, while within a range of plus-or-
minus three per cent are found all but 5.8 per cent of the cases.¹ It is
also evident that in tabulation from machine records there is a much 
greater tendency to go below rather than above the 100 point. The per- 
centage of totals below 100 is 54.4, while the percentage of totals above 
100 is only 11.2. This probably results from the fact that there are very 
few cases in which there are extra marks on the answer sheets which result 
in a total count of more than 100. On the other hand, it is more often 
the case that students fail to answer a particular question (despite 
directions to answer all questions), or else mark so poorly that the machine 
does not register them. 
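
The tolerance figures quoted above are sums over the “All Tests Combined” percentage column of Table 1. A minimal sketch of that arithmetic follows; the dictionary keys are deviations of the item total from 100, with -4 and 4 standing for the open-ended “96 and below” and “104 and above” rows, and the variable names are ours.

```python
# Per cent of the 1,670 items at each deviation of the choice total from 100
# (Table 1, "All Tests Combined"); -4 and 4 stand for the open-ended end rows.
percent_by_deviation = {
    -4: 4.4, -3: 7.0, -2: 16.2, -1: 26.8, 0: 34.4,
    1: 7.6, 2: 1.7, 3: 0.5, 4: 1.4,
}


def percent_within(tolerance):
    """Per cent of items whose total lies within plus-or-minus `tolerance` of 100."""
    return round(sum(p for d, p in percent_by_deviation.items()
                     if abs(d) <= tolerance), 1)


percent_below = round(sum(p for d, p in percent_by_deviation.items() if d < 0), 1)
percent_above = round(sum(p for d, p in percent_by_deviation.items() if d > 0), 1)

print(percent_within(1), percent_within(2), percent_within(3))
print(percent_below, percent_above)
```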

In order to see the extent of agreement between two tabulations made 


1 In actual practice, the board rechecks all cases that vary to any appreciable degree. 
from machine data, sets of examination papers were run through, by
regular clerks, on two different machines (without re-marking the papers).
These two runs were made to see what agreement exists, for the correct 
choices only, between the two separate machine runs (using two machines 
and two clerks). Table 2 shows the direction and degree of divergence 
of the second run when compared with the first. 

From Table 2, it is evident that in tabulations from machine analysis 
a tolerance of plus-or-minus one per cent will give tabulations which are 
accurate in 71.7 per cent of the cases. (Agreement of two separate tabu- 
lations of machine data is in this case the assumed criterion of accuracy.) 


Table 2
Percentage Differences Obtained from a Second Run (Correct Choice Only)

                Four Physics   Six Prin. of  Seven Engines    All Tests
                   Tests       Flying Tests      Tests        Combined
Percentage        No.            No.            No.            No.
Differences     Items      %   Items      %   Items      %   Items      %

+4 and above        8    2.5       4    0.6      12    ...      24    ...
+3                  9    2.8      48    6.7       4    ...      61    ...
+2                 32   10.0      71   10.0      22    ...     125    ...
+1                 40   12.5      99   13.9     144    ...     283    ...
0                  96   30.0     244   34.4     258    ...     598    ...
-1                 60   18.7     156   22.0     100    ...     316    ...
-2                 33   10.3      69    9.7      59    ...     161    ...
-3                 12    3.8      12    1.7      18    2.8      42    ...
-4 and below       30    9.4       7    1.0      23    ...      60    ...
Totals            320  100.0     710  100.0     640  100.0    1670  100.0

A tolerance of plus-or-minus two per cent includes 88.8 per cent of the 
cases studied, while a tolerance of plus-or-minus three per cent includes 
95.0 per cent of the cases. Five per cent of the cases show deviations as 
great as, or greater than, plus-or-minus four per cent. 
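
The agreement figures for the two machine runs can be recomputed from the item counts in the “All Tests Combined” column of Table 2. The sketch below does so; the counts are as read from the table, -4 and 4 stand for the open-ended rows, and the names are illustrative.

```python
# Items at each percentage difference between the second and first machine runs
# (Table 2, "All Tests Combined"); 4 and -4 stand for the open-ended end rows.
items_by_difference = {
    4: 24, 3: 61, 2: 125, 1: 283, 0: 598,
    -1: 316, -2: 161, -3: 42, -4: 60,
}
total_items = sum(items_by_difference.values())


def percent_agreeing_within(tolerance):
    """Per cent of items whose two runs differ by no more than `tolerance` per cent."""
    agreeing = sum(n for d, n in items_by_difference.items() if abs(d) <= tolerance)
    return round(100 * agreeing / total_items, 1)


print(total_items)
print(percent_agreeing_within(1), percent_agreeing_within(2), percent_agreeing_within(3))
```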

Whether or not the obtained deviations from “accuracy” are too great
for a particular testing program depends upon at least two factors. The 
first of these is the exactness of measurement involved in the testing itself. 
This, in turn, is a function of the validity and reliability of the tests ad- 
ministered and the adequacy of the sample used in making analyses. 
The second factor to be considered is the exactness required in the analy- 
sis data. This factor is generally the determining one in most situations. 
(Certainly, the reporting of the results of testing should not imply any 
greater precision than is noted in the analysis data themselves.) Al- 
though the matter of the required degree of accuracy of item analysis 
must be dealt with in terms of the specific needs of a specific testing program,
the writers feel that the degree of accuracy indicated above is
probably adequate for most programs. 


Accuracy of Tabulations from Machine Compared 
with Tabulations by Hand 


For purposes of comparing the accuracy of item analysis tabulations 
made by hand with those made when using the Graphic Item Counter, 
six 40-item tests were selected. These six tests were composed of two 
each in the subjects Aerology, Engines, and Principles of Flying. A 
check was made for each of the five choices in the six 40-item tests, a 
total of 1,200 choices. As above, the data used for each item were for 
100 answer sheets (or examinees). Accuracy was determined by first 
making a tabulation from the Graphic Item Counter records, then making 
a hand tabulation. Whenever the two tabulations were in agreement, 


Table 3
Number of Cases of Incorrect Tabulations

                 Type of Errors      Total Errors
Tabulation        ±1       ±2         No.      %

Hand             161       92         253    71.9
Machine           50       26          76    21.6
Both              15        8          23     6.5
Totals           226      126         352   100.0





it was judged that the tabulation was correct. Whenever discrepancy 
existed between the two separate tabulations, an additional tabulation 
was made by hand. Agreement with the second hand tabulation was the 
basis for determining whether the machine tabulation or the hand tabula- 
tion had been inaccurate. 

The hand tabulations were in perfect accord for all five choices with 
those which had been made with the Graphic Item Counter in the case of 
40 of the 240 test questions. Of the remaining 200 items, 38 items 
showed agreement on the correct choices. This gives 78 cases (32.5 per 
cent) in which the two tabulations agreed on the percentage of examinees
selecting the correct choice. (This may be compared with the 34.4 per
cent of perfect agreement in Table 1.) Of the total of 1,200 choices (240 
questions), there was agreement between hand and machine tabulations 
on 848 choices. The two tabulations were at variance in 352 of the 1,200 
choices. Table 3 shows the results obtained when the second hand tabu- 
lation was made to determine which of the two previous (one hand and 
one machine) tabulations had been in error, as well as the size of the error. 
As seen in Table 3, of the 352 cases of hand and machine disagreement,
the machine check was off in 76 cases, the hand tabulation was off in 253 
cases, while in the remaining 23 cases both hand and machine tabulations 
were inaccurate. Of the total of 352 cases of error in tabulating responses,
118 deviated by two or more per cent, and hand tabulations were
responsible for 92 of these. In percentage terms, the hand tabulations were responsible
for 71.9 per cent of the errors, the machine tabulations for 21.6 per cent, 
while 6.5 per cent of the errors occurred in both types of tabulations. 

In terms of the total number of responses in the six tests, 1,200, the 
tabulations made from machine data were in error in 6.3 per cent of the 
1,200 choices; the hand tabulations were in error in 21.1 per cent of the 
1,200 cases; and both hand and machine tabulations were in error in 1.9 
per cent of the 1,200 cases. From this, it appears that when tabulations 
are made for Graphic Item Counter data, item analysis tabulations are in 
error only about one-third as often as when tabulations are made by hand 
exclusively. 

In terms of errors in excess of plus-or-minus one per cent, machine 
tabulations showed such errors in 2.2 per cent of the 1,200 cases, while 
the hand tabulations showed such errors in 7.7 per cent of the 1,200 cases.
Again, the machine tabulation is inaccurate to a degree of plus-or-minus 
two per cent only one-third as often as is the hand tabulation. Put in 
positive terms, the tabulations from the machine record are within two 
per cent of accuracy in 97 per cent of cases, while the hand tabulations 
are within two per cent of accuracy in 92 per cent of the cases.
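
The error rates above are the Table 3 counts expressed as percentages of the 1,200 choices. A short sketch of that arithmetic is given here; the dictionary layout and names are ours, and the second figure in each pair is the count of errors larger than one per cent.

```python
CHOICES = 1200  # five choices on each of 240 items

# Erroneous tabulations from Table 3: (all errors, errors larger than one per cent).
errors = {"hand": (253, 92), "machine": (76, 26), "both": (23, 8)}


def rate(count, base=CHOICES):
    """Count expressed as a per cent of the base, to one decimal place."""
    return round(100 * count / base, 1)


for method, (all_errors, large_errors) in errors.items():
    print(method, rate(all_errors), rate(large_errors))

# Ratio of machine-only errors to hand-only errors.
machine_to_hand = errors["machine"][0] / errors["hand"][0]
print(round(machine_to_hand, 2))
```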


Time Required for Hand Tabulations Compared 
with that for Machine Tabulations 


The investigation of the accuracy of machine versus hand item analy- 
sis presented above also resulted in some interesting comparative data on 
the time required for the two types of tabulations. Table 4 shows the 
time requirements for the hand and the machine operations referred to in 
the preceding pages. 

The time data for the hand tabulations show considerable variation. 
This is due, no doubt, to the fact that four clerks prepared the first four 
sets of hand tabulations, a fifth preparing the last two listed. The tabu- 
lations made from the machine item analysis were made by the clerk who 
had prepared the hand tabulations for Flying tests A and B (the two requiring
the least amount of time).

For making the machine item analysis, 30 minutes were required for 
the clerk to wire the plugboards for counting the 200 choices in the 40- 
item, five-choice tests. Two plugboards were used. The data for the 
first 18 items were obtained by using the first plugboard, the data for 
Table 4
Tabulation Time Record

                Hand-Tabulation    GIC-Tabulation
Test                 Time               Time

Aerology A       6 hr. 40 min.         32 min.
Aerology B       6 hr. 30 min.         30 min.
Engines A        5 hr. 25 min.         30 min.
Engines B        8 hr.  0 min.         30 min.
Flying A         5 hr. 20 min.         29 min.
Flying B         4 hr. 45 min.         31 min.
Wiring time                            30 min.
Totals          36 hr. 40 min.   3 hr. 32 min.





items 19-36 by using the second plugboard. To obtain a count on items 
37-40, the wires on the second plugboard, for the last four items (33-36), 
were merely moved down four rows. It should be noted that because 
the tests used had 40 five-choice items it was necessary to make a third 
run on the machine to get the data on the last four items. If four-choice 
items for a 40-item test had been used, the necessary data could have been 
obtained in two runs. Similarly, for five-choice items it is possible with 
three runs through the machine to obtain data on as many as 54 items. 
Thus, in making comparisons with hand tabulations, the differences are 
not in this instance the maximum which are possible.

The average time required to run the six tabulations using Graphic
Item Counter item analysis was about 35 minutes, including wiring time.
The average time for the hand tabulations was about six hours. The
average time required for the hand tabulation for the clerk who ran the
machine analysis was about five hours as compared with an average time 
of about 35 minutes per machine tabulation. In round figures, the clerk 
making the speediest hand tabulations required over eight times as long 
for hand analyses as was needed for tabulations made with the Graphic 
Item Counter. 
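
The averages quoted above can be reproduced from the Table 4 times. The sketch below works in minutes; it takes the Engines B hand time as 8 hr. 0 min., the value consistent with the printed total of 36 hr. 40 min., and the variable names are ours.

```python
# Table 4 times converted to minutes.
hand_minutes = {
    "Aerology A": 400, "Aerology B": 390,
    "Engines A": 325, "Engines B": 480,   # Engines B taken as 8 hr. 0 min.
    "Flying A": 320, "Flying B": 285,
}
machine_minutes = {
    "Aerology A": 32, "Aerology B": 30,
    "Engines A": 30, "Engines B": 30,
    "Flying A": 29, "Flying B": 31,
}
WIRING_MINUTES = 30

hand_total = sum(hand_minutes.values())
machine_total = sum(machine_minutes.values()) + WIRING_MINUTES

hand_average_hours = hand_total / len(hand_minutes) / 60
machine_average = machine_total / len(machine_minutes)

# The clerk who ran the machine analyses also made the two Flying hand tabulations.
fastest_clerk_average = (hand_minutes["Flying A"] + hand_minutes["Flying B"]) / 2
ratio = fastest_clerk_average / machine_average

print(divmod(hand_total, 60), divmod(machine_total, 60))
print(round(hand_average_hours, 1), round(machine_average, 1), round(ratio, 1))
```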


Summary 


1. Item analyses made by clerks using the Graphic Item Counter 
attachment of the IBM test scoring machine are more accurate than those 
made by hand tabulation alone. 

2. Item analysis tabulations made by clerks using the Graphic Item 
Counter are sufficiently accurate to meet the requirements of most testing 
programs. 

3. Item analysis tabulations made by hand take more than eight 
times as long as do those made with the Graphic Item Counter. 


Received March 24, 1945. 





The Interrelationship of Visual Acuity at Different Distances* 


William James Giese 
Division of Education and Applied Psychology, Purdue University 


In present day employment methods it is almost universally standard 
practice to administer a test of visual acuity. In many companies this 
test consists of the standard Snellen chart administered at a distance of 
twenty feet. The tacit assumption is made that normal vision, 20/20 
rating on the Snellen test, is “good” vision; the practice is that for jobs 
requiring “good” vision personnel should rate 20/20 on the test. In 
practice this method of personnel allocation is often used on the basis of 
arbitrary judgment, and when validation studies are made it is sometimes 
found that this method of vision testing is far from satisfactory. Indeed, 
in some instances the use of a single distance acuity test actually elimi- 
nates the potentially successful workers and selects those visually unfitted 
for the job. Tiffin and Wirt in their investigation of visual acuity and 
hourly production of hosiery loopers found a correlation of approximately
-.60 between visual acuity measured at twenty feet and production.
The work of these loopers was at a visual distance of only
eight inches which presents quite a different visual task than passing a 
test of visual acuity at twenty feet. From the point of view of the place- 
ment of industrial employees in jobs for which they are visually qualified 
it is important to know what the relationships are between visual skill 
tested at various distances. In addition, it is desirable to know the 
variation of average acuity for different distances as well as the variation 
of the spread of individual differences about the mean acuity for different 
distances. 

The present research was undertaken to provide an answer based on 
experimental evidence to the following three main points: 


(1) To determine experimentally the relation between visual acuity 
at various distances. 

(2) To determine the average visual acuity in terms of minute angles 
at various distances. 

(3) To determine the extent of individual differences in visual acuity 
at various distances. 


* This article is based on the author’s thesis of the same title submitted to the 
faculty of Purdue University in partial fulfillment of the requirements for the degree 
of Doctor of Philosophy, October, 1945. The thesis was directed by Dr. Joseph Tiffin. 

1 Joseph Tiffin and S. E. Wirt. Near vs. distance acuity in relation to success on close
industrial jobs. Trans. Amer. Acad. Ophth. and Otolaryng. (June 1944), Suppl. pp. 6-16. 
Procedures


A multiple choice checker test for visual acuity² was chosen as the
instrument for measuring visual acuity (see Figure 1). This visual 
target is so proportioned that for “normal” vision at any distance the 
individual checkers in the test square subtend a visual angle of one 
minute while the individual checkers in the four remaining control 
squares subtend a visual angle of 12 seconds. The checker design in 
the control squares is so fine that for individuals of exceedingly high 





Fig. 1. Checker board design.


acuity they blend into a neutral grey. All of the visual targets were 
mounted in the center of an eight by eight inch grey card which matched
the grey of the control squares. These cards were presented to the
subjects in a specially designed box which kept such factors as illumination
and ocular distance constant. When this design presents a visual
angle which is below a subject’s visual acuity to resolve, the test square
blends into a neutral grey matching the control squares leaving the 
subject’s response one controlled by chance. 
The checker design was used instead of the more customary letter 
chart because of the following limitations of the letter chart: 


(1) Measures, in addition to visual acuity, readability of letters.³

(2) Does not lend itself readily to safe reduction especially for short 
distances. 

(3) Adequately differentiates only substandard levels of acuity while
at standard and superior levels it differentiates only grossly.* 

(4) The reliability and validity of the chart as a whole are difficult to
establish since these factors are different for each letter in the
chart.⁴


2 Developed by the Scientific Bureau of the Bausch & Lomb Optical Co., Rochester,
New York.

3 J. P. S. Walker. Test type. Brit. J. Ophthal. (1942), 26, 556-559.

4 Joseph Tiffin. Industrial psychology. New York: Prentice-Hall, Inc., 1944, pp.
128-129.
Each of these four limitations of the letter chart is a strong reason
for the use of the multiple choice checker visual acuity test design since
it has been carefully designed so that the disadvantages of the letter 
test have been eliminated or greatly reduced. The apparatus was set 
up so that eight levels of visual acuity could be obtained at the distances 
of .20, .25, .33, .40, .50, 1.00, 5.00, and 10.00 meters which require a 
diopter change in focal power from infinity of 5, 4, 3, 2.5, 2, 1, .2, and .1
respectively. The room in which the apparatus was installed was large 
and well ventilated and fitted with black out shutters which were ad- 
justed so the illumination in the vicinity near the apparatus was under 
one foot candle. The illumination on the visual target was eight foot 
candles which is one foot candle more than is recommended by Ferree 
and Rand as the optimum illumination for the purpose of obtaining 
measurement of visual acuity. The purpose and method of taking the 
test were explained through standard instructions to each subject. Half 
of the subjects started at the near distance and worked back to the far 
distance while half started at the far distance and worked down to the 
near distance. At each distance the subject started with the target 
with the smallest visual acuity rating (the largest visual angle) which 
was increased by .2 visual acuity rating steps until he made a wrong 
judgment. The subject was then presented with the previous target on 
which he was asked to make four judgments. If he failed to make four 
correct judgments he was then presented with the target of the next 
larger visual angle and again asked to make four judgments. This 
procedure was continued until the subject was able to make four suc- 
cessive correct judgments. The subject’s visual acuity rating for any 
distance was, then, the smallest target on which he could make one 
original and four later successive correct judgments. Since there are 
four alternatives for each of the five judgments, a subject has only one 
chance in 1024 to make five correct judgments when he cannot see the 
location of the checkers. Four hundred subjects were tested for their 
visual acuity by this method, and eighty-nine additional subjects were 
tested with the only change in the procedure a retest at each distance 
for the purpose of determining the reliability of the tests. All of the 
subjects had previously been screened by a 20/20 standard at twenty 
feet on a letter chart. The median age of the group was 19 with none 
under 17, but some as old as 36. 
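
The two pieces of arithmetic in the procedure above can be checked with a short sketch. It assumes only that the diopter change from infinity is the reciprocal of the test distance in meters and that guesses are independent; the function names are illustrative and do not come from the original study.

```python
# Test distances in meters, as listed in the procedure.
DISTANCES_M = [0.20, 0.25, 0.33, 0.40, 0.50, 1.00, 5.00, 10.00]


def diopters_from_infinity(distance_m: float) -> float:
    """Focal power in diopters needed to focus at distance_m, relative to infinity."""
    return 1.0 / distance_m


def chance_of_guessing(alternatives: int = 4, judgments: int = 5) -> float:
    """Probability that independent guessing gives `judgments` correct answers in a row."""
    return (1.0 / alternatives) ** judgments


if __name__ == "__main__":
    for d in DISTANCES_M:
        print(f"{d:5.2f} m -> {diopters_from_infinity(d):.1f} D")
    print(f"five lucky guesses: 1 chance in {round(1 / chance_of_guessing())}")
```

With four alternatives and five judgments the chance figure is 4 to the fifth power, which is the 1024 quoted in the text.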


5 C. E. Ferree and G. Rand. New ideas in eye testing. Personnel J. 1939, 18, 13-20. 

*A complete detailed description of the experimental procedure is given in W. J. 
Giese, ‘The interrelationship of visual acuity at different distances,’ a thesis submitted 
to the faculty of Purdue University in partial fulfillment of the requirements for the 
degree of Doctor of Philosophy in Psychology, October, 1945. 
Results 


Reliabilities. Table 1 summarizes the statistics concerned with the 
test-retest reliabilities of the multiple choice checker acuity test for 
various distances. With these subjects acuity measured at 5.00 and 
10.00 meters is more reliable than acuity measured at shorter distances. 
Probably the reason for this is that fluctuations in accommodation and 
convergence are more likely to occur in near than in far vision tasks. 
Such fluctuations could account for the somewhat lower reliabilities of 
the near vision tests. Obtained Pearson Product Moment coefficients 
of correlation corrected for grossness of grouping appear under the column 
headed $r'_{x_1x_2}$; these show that, since the lower reliabilities of the near 
tests are not due to variation in grouping, there are factors other than 


Table 1 

Test-Retest Reliabilities of the Multiple Choice Visual Acuity 
Checker Test at Various Distances 

Distance      r      r'     eps'   eta    eps    Chi               eps'   eta    eps    Chi 
in Meters    x1x2   x1x2    x2x1   x2x1   x2x1   Square   P        x1x2   x1x2   x1x2   Square   P 

   .20       .87    .92     .92    .88    .87     5.90   .32       .91    .87    .86     2.81   .7 
   .25       .72    .77     .81    .78    .76    18.49   .0024     .80    .75    .74     8.49   .19 
   .33       .82    .87     .90    .86    .85    19.56   .0017     .92    .83    .82     3.65   .44 
   .40       .80    .84     .89    .86    .85    30.36   .00001    .85    .83    .81    10.56   .06 
   .50       .80     ?       ?     .82    .81     6.27   .2        .84    .82    .80     5.55   .36 
  1.00       .82    .87     .87    .83    .82     3.21   .67       .86    .83    .81     1.72   .87 
  5.00       .91    .97     .98    .92    .92     8.74   .12       .97    .92    .91     4.09   .54 
 10.00       .93   1.00    1.00    .93    .93     3.07   .55      1.00    .93    .93     3.55   .47 

(Primed coefficients are corrected for grossness of grouping. Entries marked ? are not legible.) 





coarse test step intervals which reduce the reliability. If it were possible 
to decrease the size of the test step intervals without altering the fatigue 
factor one should not expect the reliability coefficients to be greater 
than those listed under column $r'_{x_1x_2}$. Through a comparison of column 
$r_{x_1x_2}$ with column $r'_{x_1x_2}$ the conclusion can be drawn that although the 
increase in reliability would be desirable it would not have been feasible 
to attempt to achieve it through finer test step intervals, especially since 
they would have greatly increased the length of time for test administration. 
For all but three of the near distances a linear regression line gives 
a satisfactory fit, and it is only when eta is computed on the initial test 
scores that there is statistical significance to the fact that some line other 
than a straight one would be a more satisfactory fit. An inspection of 


* Chi Square = $(N - K)\,\dfrac{\eta^2 - r^2}{1 - \eta^2}$. C. C. Peters and W. R. Van Voorhis. Statistical 
procedures and their mathematical bases. New York: McGraw-Hill Book Co., Inc., 
1941. 
the scattergrams showed that this was caused by a selective regression 
to the mean on the retest. The cases with very high acuity on the initial 
test, as a group, did more poorly on the retest, but the cases with low 
acuity ratings, as a group, did not shift one way or the other on retest. 
However, by a comparison of columns $r_{x_1x_2}$, $\epsilon_{x_2x_1}$, and $\epsilon_{x_1x_2}$, it can be seen that 
although the non-linearity of the reliabilities at the distances of .25, 
.33, and .40 meters has high statistical significance, the difference in 
degree of relationship expressed by a linear or curvilinear coefficient is so 
small that it has no practical import. Another factor that should 


Table 2 
Comparison of Means and Standard Deviations on Initial and Retest 

Means 

Distance    Initial Mean ± SE    Retest Mean ± SE 

   .20         .89 ± .0225          .89 ± .0218 
   .25         .98 ± .0213          .98 ± .0210 
   .33        1.19 ± .0232         1.18 ± .0210 
   .40        1.27 ± .0256         1.27 ± .0256 
   .50        1.29 ± .0254         1.26 ± .0249 
  1.00        1.49 ± .0365         1.51 ± .0359 
  5.00        1.60 ± .0315         1.61 ± .0316 
 10.00        1.54 ± .0292         1.56 ± .0293 

(The Diff. I-R, SE Diff., and P columns for the means are legible only in part: 
.018, .011, .022, .013, .011.) 

Standard Deviations 

Distance    Initial Sigma ± SE    Retest Sigma ± SE    Diff. I-R    SE Diff.     P 

   .20        .2121 ± .0159         .2053 ± .0154        .0068       .0080      .40 
   .25        .2012 ± .0151         .1981 ± .0148        .0031       .0112      .76 
   .33        .2190 ± .0164         .1981 ± .0148        .0209       .0094      .03 
   .40        .2416 ± .0181         .2419 ± .0181       -.0003       .0115      .98 
   .50        .2396 ± .0180         .2349 ± .0176        .0047       .0118      .70 
  1.00        .3444 ± .0258         .3391 ± .0254        .0053       .0154      .70 
  5.00        .2968 ± .0222         .2982 ± .0224       -.0020       .0095      .83 
 10.00        .2751 ± .0206         .2767 ± .0207       -.0016       .0077      .83 


be isolated is the effect of learning. Table 2 lists the means on initial 
test and retest as well as the difference between means and the 
probabilities that the difference could have arisen by chance. Similar data are 
presented for the standard deviations. There is only one P value, for 
the .50 meter distance, which indicates a statistically significant shift 
in the means, and it is in the direction of negative learning. It is, 
however, a rather small absolute shift, .03, when compared to the step 
intervals of the tests, .20. With the exception of this small negative 
shift there is no learning as revealed by shifts in the means. A similar 
analysis was made for shifts in the standard deviations in which the 
smallest P value was .03 and all but two of the P values were .70 or above. 
Also, four of the shifts were in the direction of smaller standard deviations 
on the retest while four shifted in the direction of larger standard 
deviations on the retest. The P value of .03 was for a smaller standard 
deviation on retest for the .33 meter distance, which was one of the distances 
for which a straight line had a very low probability of being the best 
fitting regression line. 

In summary, the statistics on the test-retest reliabilities show that 
the tests have adequate reliability, and that the means and standard 
deviations are stable from test to retest. 
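
The chi-square values for linearity in Table 1 can be approximated with the sketch below. It assumes the footnoted statistic has the standard form (N - K)(eta^2 - r^2)/(1 - eta^2), and it assumes K = 7 columns, since the text says only that the number of columns is "always 7 or less". The function name is illustrative.

```python
def linearity_chi_square(eta: float, r: float, n: int, k: int) -> float:
    """Chi square for departure from linearity.

    eta : correlation ratio, r : Pearson coefficient,
    n   : number of cases,   k : number of columns (arrays) in the scattergram.
    """
    if not (0.0 <= abs(r) <= eta < 1.0):
        raise ValueError("expected 0 <= |r| <= eta < 1")
    return (n - k) * (eta ** 2 - r ** 2) / (1.0 - eta ** 2)


if __name__ == "__main__":
    # .25 meter row of Table 1: r = .72, eta = .78, 89 retest cases, K assumed to be 7.
    print(round(linearity_chi_square(eta=0.78, r=0.72, n=89, k=7), 2))
```

Because the tabled coefficients are rounded to two places and K is assumed, the result agrees with the printed 18.49 only approximately.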

The Relationship of Visual Acuity at Different Distances. The greater 
the difference in diopters of accommodation required in the vision tests 
the less the degree of relationship between them. Stated in terms of 


Table 3 

Intercorrelation of Visual Acuity at Various Distances 
(Obtained Pearson Product Moment Coefficients of Correlation) 

Distance    .20    .25    .33    .40    .50   1.00   5.00  10.00 

   .20      .87* 
   .25      .50    .72* 
   .33      .46    .43    .82* 
   .40      .41    .39    .52    .80* 
   .50      .39    .42    .42    .49    .80* 
  1.00      .29    .27    .35    .50    .50    .82* 
  5.00      .17    .27    .34    .38    .45    .53    .91* 
 10.00      .33    .36    .37    .41    .41    .32    .56    .93* 





* Reliability coefficients from a sample of 89 cases apart from the sample of 400 cases 
used in the intercorrelations. 


test distances this means that the greater the difference in the test 
distance between two vision tests the smaller the relationship between 
the acuity measures. Table 3 gives the obtained correlation matrix 
from which it is clear that all of the interdistance correlations are sub- 
stantially smaller than the coefficients of reliability. Correlations be- 
tween adjacent distances have a median of .52 while the median relia- 
bility correlation is .82. Table 4 summarizes the effect of differences in 
distances and degree of relationship between tests. 

Although the differences which appear in the reliabilities for various 
distances could not radically change the pattern of the results, they might 
partially obscure or distort them. The range in the reliabilities is from 
.72 to .93 so some of the differences in interdistance correlations could 
be due to the fact that the reliabilities varied. Table 5 presents the 
Table 4 

Median Correlations by Amount of Separation Between Distances 
(Obtained Pearson Product Moment Coefficients of Correlation) 

Distances away    0 (Reliability)   1 (Adjacent)     2      3      4      5      6 

Median r               .82              .52         .44    .41    .36    .29    .26 











matrix of correlation in Table 3 corrected for attenuation with the results, 
of course, that all of the correlations are somewhat higher. However, 
the essential relationship between the correlations not only has the same 
structure but this structure is more clear. For instance, the range of 
correlations between adjacent distances is only from .60 to .64, but in 


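Table 5 
Intercorrelation of Visual Acuity at Various Distances 
(Obtained Pearson Product Moment Coefficients of Correlation 
Corrected for Attenuation) 

Distance    .20    .25    .33    .40    .50   1.00   5.00  10.00 

   .25      .63 
   .33      .55 
   .40      .49 
   .50      .47 
  1.00      .33 
  5.00      .19 
 10.00      .37 

(Only the .20 column is legible; of the remaining entries the fragments .60, .36, 
and .61 survive without row or column.) 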





the uncorrected matrix the range was from .43 to .56. Table 6 presents 
the corrected median correlations for easy comparison to Table 4. 
Table 6 shows the relationship between distances if the measures 
had perfect reliability. Even with perfect reliability the median cor- 
relation with an adjacent distance would be only .60. If the distances 


Table 6 

Median Correlations by Amount of Separation Between Distances 
(Pearson Product Moment Coefficients of Correlation Corrected for Attenuation) 

Distances away    0 (Reliability)   1 (Adjacent)     2      3      4      5      6 

Median r              1.00              .60         .52    .48    .43    .33    .31 
of .40 and 5.00 meters are eliminated from the matrix in Table 5, the 
difference between distances would then represent one diopter change 
in focal power required with the exception of 1.00 to 10.00 meters which 
is .9 instead of 1.0. Although this reduces the number of correlations 
from which to draw a median, it has the advantage that the medians 
can be listed for different amounts of diopter change required, a more 
meaningful concept in terms of the visual task confronting the eye. 


Table 7 
Median Correlations by Amount of Diopter Change in Focal Power Required 
(Pearson Product Moment Coefficients of Correlation Corrected for Attenuation) 

Diopter change    None (Reliability)     1      2      3      4      5 

Median r               1.00             .60    .51    .42    .38    .37 





Table 7 clearly shows that the greater the diopter change in focal 
power required of the eye from one visual acuity test to another the 
less the degree of relationship between the two measures. Even one 
diopter of change required greatly reduces the relationship. For three 
diopters or more change the relationship between tests is very low. 
Once the visual task has been changed three diopters, it might as well 
be changed four or five since the relationship drops only minutely for 
the greater changes. 
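
The correction for attenuation used in Tables 5 through 7 is not written out in the text. A minimal sketch, assuming the usual formula in which the obtained correlation is divided by the square root of the product of the two reliabilities, reproduces the legible entries of Table 5 from Table 3; the function name is illustrative.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between two measures if both were perfectly reliable.

    r_xy : obtained correlation between the two tests
    r_xx : reliability of the first test
    r_yy : reliability of the second test
    """
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(r_xx * r_yy)


if __name__ == "__main__":
    # .20 vs .25 meters in Table 3: r = .50, reliabilities .87 and .72.
    print(round(correct_for_attenuation(0.50, 0.87, 0.72), 2))
    # .20 vs 5.00 meters in Table 3: r = .17, reliabilities .87 and .91.
    print(round(correct_for_attenuation(0.17, 0.87, 0.91), 2))
```

The two results, .63 and .19, match the corresponding entries in the .20 column of Table 5, which supports the assumption about the formula for that column at least.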

All of the interdistance correlations presented so far have been linear 
coefficients. Inasmuch as coefficients of curvilinear correlation might 
give a different structure to the results, a matrix of correlation ratios is 
presented in Table 8. Here again, the general pattern of the results 


Table 8 


Intercorrelation of Visual Acuity at Various Distances 
(Correlation Ratios—Eta) 








Distance    .20    .25    .33    .40    .50   1.00   5.00  10.00 

   .20      .88* 
   .25      .51    .78* 
   .33      .49    .44    .88* 
   .40      .46    .42    .53    .86* 
   .50      .44    .44    .42    .50    .82* 
  1.00      .44    .31    .40    .55    .53    .83* 
  5.00      .26    .31    .37    .44    .47    .55    .92* 
 10.00      .34    .36    .40    .42    .42    .33    .57    .93* 





* Reliability coefficients from a sample of 89 cases apart from the sample of 400 cases 
used in the intercorrelations. 
Table 9 
Median Correlation Ratios by Amount of Separation Between Distances 

Distances away    0 (Reliability)   1 (Adjacent)     2      3      4      5      6 

Median eta             .87              .53         .44    .44    .39    .40    .31 








remains the same, and a summary of the median relationship between 
differences in distances and amount of relationship, Table 9, shows no 
change in the effect noted with the linear coefficients with the exception 
that when the distances between the tests are great, the relationship is 
not reduced quite so much. 

Table 10 shows that when etas are used as a measure of interrelationship 
a change of just one diopter in focal power required makes the 
relationships between tests very low. Even though there is a consistent 
downward trend of the relationships as the change in diopters of accommodation 
increases, additional changes beyond one in diopter requirements 
between tests do not attenuate intertest relationships more than 
minutely. 


Table 10 
Median Correlation Ratios by Amount of Diopter Change in Focal Power Required 

Diopter change    None (Reliability)     1      2      3      4      5 

Median eta              .85             .44    .43    .40    .35    .34 








One objection to eta as a measure of relationship is that it is generally 
too large, since chance variations may often be of such a nature as to 
reduce the variance within the columns as compared to the variance of 
the total distribution. This objection is particularly cogent when applied 
to data composed of few cases and separated into many columns. The 
interrelationship figures in this study are based on over 400 cases which 
should be adequate to minimize the effect of chance variation, especially 
since the number of columns is always 7 or less. In the test-retest data, 
however, the number of cases is only 89, though again, the number of 
columns is always 7 or less. Because of the objections to eta as a measure 
of correlation the correlations are also expressed by epsilon, a 
correlation ratio without bias. These appear in Table 11. The maximum 
decrease from the etas is .03 for the interrelationship ratios and 
.03 for the intra-relationship ratios. Again, the general structure of the 
results is similar. In the interest of completeness these correlation 
Table 11 

Intercorrelation of Visual Acuity at Various Distances 
(Correlation Ratios without Bias—Epsilon)† 

Distance    .20    .25    .33    .40    .50   1.00   5.00  10.00 

   .20      .87* 
   .25      .50    .76* 
   .33      .48    .44    .85* 
   .40      .45    .41    .52    .85* 
   .50      .41    .43    .42    .49    .81* 
  1.00      .43    .30    .39    .55    .52    .82* 
  5.00      .23    .29    .36    .43    .46    .55    .92* 
 10.00      .32    .35    .39    .41    .42    .31    .56    .93* 

† All of the above ratios reach the 1% level of significance. 
* Reliability coefficients from a sample of 89 cases apart from the sample of 400 cases 
used in the intercorrelations. 

ratios without bias are corrected for grossness of grouping and the 
resulting matrix is shown in Table 12. An inspection of this table shows 
that with only a few minor reversions, the relationship between acuity 
measures decreases as the difference in diopters of focal power increases. 
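
The relation between eta and epsilon can be sketched as follows. The text does not give its formula for epsilon; the sketch assumes the common unbiased form, epsilon squared = 1 - (1 - eta squared)(N - 1)/(N - k), with k = 7 columns, and the function names are illustrative.

```python
import math
from collections import defaultdict


def correlation_ratio(x_classes, y_scores) -> float:
    """Eta of y on x: square root of between-column over total sum of squares."""
    columns = defaultdict(list)
    for x, y in zip(x_classes, y_scores):
        columns[x].append(y)
    n = len(y_scores)
    grand_mean = sum(y_scores) / n
    ss_total = sum((y - grand_mean) ** 2 for y in y_scores)
    if ss_total == 0:
        raise ValueError("y scores have no variance")
    ss_between = sum(
        len(col) * (sum(col) / len(col) - grand_mean) ** 2
        for col in columns.values()
    )
    return math.sqrt(ss_between / ss_total)


def epsilon_from_eta(eta: float, n: int, k: int) -> float:
    """Correlation ratio without bias for n cases grouped into k columns (assumed form)."""
    eps_squared = 1.0 - (1.0 - eta ** 2) * (n - 1) / (n - k)
    return math.sqrt(max(eps_squared, 0.0))


if __name__ == "__main__":
    # Reliability etas of .88 and .78 with 89 cases and an assumed 7 columns.
    print(round(epsilon_from_eta(0.88, 89, 7), 2))
    print(round(epsilon_from_eta(0.78, 89, 7), 2))
```

Under these assumptions the etas of .88 and .78 shrink to .87 and .76, the epsilon values tabled for the .20 and .25 meter reliabilities, and the shrinkage is smaller still with 400 cases.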


Table 12 


Intercorrelation of Visual Acuity at Various Distances 
(Correlation Ratios without Bias—Epsilon corrected for Broad Categories) 








Distance    .20    .25    .33    .40    .50   1.00   5.00  10.00 

   .20      .92* 
   .25      .56    .81* 
   .33      .52    .49     ? 
   .40      .50    .46     ?     .89* 
   .50      .45    .49     ?     .54    .85* 
  1.00      .47    .33     ?     .60    .54    .87* 
  5.00      .25    .32     ?     .47    .50    .59    .98* 
 10.00      .34    .38     ?     .44    .44    .33    .59   1.01* 

(Entries marked ? are not legible.) 





* Reliability coefficients from a sample of 89 cases apart from the sample of 400 cases 
used in the intercorrelations. 

The Relationship Between Mean Acuity and Distance. The mean 
visual acuity steadily becomes smaller as the test distance becomes 
smaller. Table 13 gives both the mean acuity and the standard devia- 
tion about the mean, and Figure 2 presents the same data graphically. 
The data show that for distances requiring two or more diopter change 
in focal power the spread of individual differences is significantly lower 
than those distances requiring one diopter or less. Also, from the distance 
of 1.00 meter down to .20 meter (1 diopter to 5 diopters) the mean 
steadily decreases as the distance decreases. This result agrees with the 
findings of Luckiesh and Moss in which they determined the acuity 
threshold of 10 subjects for the distances of .60, 1.20, and 2.80 meters and 
found that for each subject acuity, measured by the reciprocal of the 
visual angle in minutes subtended by the critical detail, increased with 
increase in distance between observer and stimulus. Comparing Table 


Table 13 
Visual Acuity: Means and Standard Deviations in Visual Decimal Obtained on the 
Multiple Choice Checker Test 
(Obtained from 400 cases used for the intercorrelations) 

Distance in Meters      Mean ± SE       Standard Deviation ± SE 

       .20             .95 ± .0100          .199 ± .0070 
       .25            1.05 ± .0098          .194 ± .0069 
       .33            1.28 ± .0095          .190 ± .0068 
       .40            1.37 ± .0083          .166 ± .0059 
       .50            1.39 ± .0099          .198 ± .0070 
      1.00            1.63 ± .0142          .283 ± .0101 
      5.00            1.61 ± .0154          .307 ± .0109 
     10.00            1.35 ± .0153          .305 ± .0108 





13 with the tables of the correlations, no consistent relationship between 
any of the Pearson Product Moment coefficients of correlation (obtained, 
corrected for attenuation, or corrected for broad categories) or the 
three correlation ratios and the standard deviations is found. In addi- 
tion, there is no relationship between variation in the means and varia- 


tion in the size of the measures of correlation. One fact does stand 
out, however; the nearest and the two farthest distances have the highest 
reliabilities. Of the three highest sigmas for the test-retest data, two 


Table 14 

Differences in Means of the Near Point Tests from All Farther Point Tests Along with 
Their Standard Errors of the Difference 

Distance      .20          .25          .33          .40          .50          1.00         5.00 

   .25     .10±.0141 
   .33     .33±.0140    .23±.0138 
   .40     .42±.0131    .32±.0122    .09±.0127 
   .50     .44±.0142    .34±.0141    .11±.0139    .02±.0130 
  1.00     .68±.0177    .58±.0174    .35±.0173    .26±.0166    .24±.0175 
  5.00     .66±.0186    .56±.0184    .33±.0183    .24±.0176    .22±.0185   -.02±.0211 
 10.00     .40±.0185    .30±.0184    .07±.0182   -.02±.0176   -.04±.0184   -.28±.0211   -.26±.0219 





* M. Luckiesh and F. K. Moss. The dependency of visual acuity upon stimulus 
distance. J. opt. Soc. Amer., XXIII, pp. 25-29. 
are for the distances of 5.00 and 10.00 which also have the two highest 
reliabilities. Although the sigma for the .20 meter distance is small, 
the distribution of acuity scores at this distance is substantially normal 
which should, partially at least, account for a somewhat higher relia- 
bility at this distance. The differences between the means are tabulated 
in Table 14 which shows that except for the 5.00 and 10.00 meters 
distances there is a consistent increase in the mean performance as the 
number of diopters of accommodation required decreases. But how many 
of these shifts have statistical significance? The critical ratios for the 
shifts are presented in Table 15, an inspection of which shows that all 


Table 15 
Significances of the Differences (Critical Ratios) in Table 14 

Distance     .20     .25     .33     .40     .50    1.00    5.00 

   .20 
   .25      7.09 
   .33     23.57   16.67 
   .40     32.06   26.23    7.07 
   .50     30.99   24.11    7.91    1.54 
  1.00     38.42   33.33   20.23   15.66   13.71 
  5.00     35.48   30.43   18.03   13.64   11.89     .95 
 10.00     21.62   16.30    3.85    1.14    2.17   13.27   11.87 





of the shifts are significant except .50-.40, 10.00-.40, 10.00-.50, and 
5.00-1.00 when the rule of thumb of a critical ratio of 3.00 or better 
for significance is used. Not only do the majority of the shifts have a 
very high statistical significance but they also have practical import 
when the shift of the total distribution is considered. In addition to a 
shift in the means, there is a difference in the spread of the distributions. 
Table 16 tabulates the differences in spread which shows, in general, 


Table 16 

Difference in Standard Deviations of the Near Point Test from All Farther Point Tests 
Along with Their Standard Errors of the Difference 

Distance       .20           .25           .33           .40           .50           1.00          5.00 

   .20 
   .25     -.005±.0099 
   .33     -.009±.0098   -.004±.0098 
   .40     -.033±.0092   -.028±.0092   -.024±.0091 
   .50     -.001±.0100    .004±.0099    .008±.0098    .032±.0092 
  1.00      .084±.0124    .089±.0123    .093±.0123    .117±.0118    .085±.0124 
  5.00      .108±.0131    .113±.0130    .117±.0130    .141±.0125    .109±.0131    .024±.0150 
 10.00      .106±.0130    .111±.0130    .115±.0129    .139±.0124    .107±.0130    .022±.0149   -.002±.0155 
that the near point tests have a much more restricted range than do the 
far point tests. This is also graphically illustrated in Figure 2. The 
significance of these shifts is shown in Table 17. Although not as 
many of the differences in standard deviations have significance as did 
the differences in means, there are numerous highly significant differences. 
In general, the differences between the tests at .50 meters or 
less and the tests at 1.00 meters or more are very reliable. 

Fig. 2. Visual acuity means and standard deviations obtained on the 
multiple choice checker test. 
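
The critical ratios in Tables 15 and 17 are each a difference divided by its standard error. The sketch below assumes the standard error of a difference was taken as the square root of the sum of the two squared standard errors, with no correlation term; that form is an inference from the tabled values, which it reproduces closely, and the function names are illustrative.

```python
import math


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of a difference, treating the two estimates as uncorrelated."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def critical_ratio(difference: float, se_diff: float) -> float:
    """Absolute difference expressed in units of its standard error."""
    return abs(difference) / se_diff


def is_significant(ratio: float, threshold: float = 3.0) -> bool:
    """Rule of thumb used in the text: a critical ratio of 3.00 or better."""
    return ratio >= threshold


if __name__ == "__main__":
    # Means at .20 and .25 meters from Table 13: .95 ± .0100 and 1.05 ± .0098.
    print(round(se_of_difference(0.0100, 0.0098), 4))
    # Difference and standard error as printed in Table 14 for the same pair.
    cr = critical_ratio(0.10, 0.0141)
    print(round(cr, 2), is_significant(cr))
```

Using the printed standard error of .0141 gives the 7.09 shown in Table 15; the computed standard error differs from the printed one only in the fourth decimal place.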

Not only do the means and standard deviations show a consistent 
trend with changes in test distance but also the shape of the distribution 
of individual differences shows an interesting relationship with test 
distance. Figure 3 shows that for the two near point tests (.20 and 
.25 meters) the spreads of individual differences are fairly normal, but 
Table 17 
Significance of the Differences (Critical Ratios) in Table 16 

Distance     .20     .25     .33     .40     .50    1.00    5.00   10.00 

   .20 
   .25       .51 
   .33       .92     .41 
   .40      3.59    3.04    2.64 
   .50       .10     .40     .82    3.48 
  1.00      6.77    7.24    7.56    9.92    6.85 
  5.00      8.24    8.69    9.00   11.28    8.32    1.60 
 10.00      8.15    8.54    8.91   11.21    8.23    1.48     .13 





as the test distance is increased up to 1.00 meter the distributions become 
more and more negatively skewed. The distribution for the 5.00 meter 
distance is nearly as badly skewed as is the distribution for the 1.00 
meter distance. The spread of individual differences at the 10.00 meter 
distance, although not normal, is not skewed nearly as much as the 
distributions for .50, 1.00, and 5.00 meters. A scanning of all of the 
distributions shows that for these subjects the optimum distance for 
acute vision is between 1.00 meter and 5.00 meters. 

Fig. 3. Histogram on the spread of individual differences at different distances on the 
multiple choice checker test. 


Conclusions and Implications 


1. Although the relationship between tests of visual acuity at dif- 
ferent distances is statistically significant, it is not high enough to 
presume that a measure of visual acuity at one distance will any more 
than very roughly predict visual acuity at another distance. This result 
is more clearly stated in terms of the task confronting the eye rather 
than in given changes in test distances in terms of feet or meters. For 
example, a change of only .30 of a meter from .20 meters to .50 meters 
is a change of 3 diopters of focal power required whereas this change 
at the 5.00 or 10.00 meter distance is only a fraction of a diopter, .01 and 
.003 respectively. The clear implication to employment office practice 
is that vision tests for employee allocation should be administered as 
close to the same distance that is required for critical vision on the job 
as is practically possible. In addition, the tests should not be installed 
on an a priori basis but should be validated through the same general 
technique used in the validation of clerical, mechanical, or intelligence 
tests. 

2. Not only does a small difference in test distance for near vision 
greatly attenuate the relationship between tests but also the shorter 
the focal distance the lower is the general acuity. This suggests that 
jobs requiring constant critical near vision should be re-engineered to 
reduce this demand or, failing that, ocular aids should be provided since 
there is only a small percentage of individuals who possess the visual 
skills to resolve detail subtending less than a one minute visual angle. 
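
The diopter figures quoted in the first conclusion follow from taking the focal power required at a distance of d meters as 1/d diopters, so that a shift from d_1 to d_2 changes the demand by the difference of the reciprocals. The worked values below are a check on the text's arithmetic under that assumption; the symbols are introduced here only for illustration.

```latex
\Delta D = \frac{1}{d_1} - \frac{1}{d_2}
\qquad (d_1, d_2 \text{ in meters})

\text{.20 to .50 m:}\quad \frac{1}{0.20} - \frac{1}{0.50} = 5.00 - 2.00 = 3.00 \text{ diopters}

\text{5.00 to 5.30 m:}\quad \frac{1}{5.00} - \frac{1}{5.30} \approx 0.2000 - 0.1887 \approx 0.01 \text{ diopter}

\text{10.00 to 10.30 m:}\quad \frac{1}{10.00} - \frac{1}{10.30} \approx 0.1000 - 0.0971 \approx 0.003 \text{ diopter}
```

The same .30 meter shift is therefore several hundred times more demanding at the near point than at the far point, which is why the results are stated in diopters rather than in feet or meters.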


Summary 


1. The inter-relationship between visual acuity at various distances 
measured by the multiple choice checker design visual tests with the 
population of subjects used in this experiment is low, and the inter- 
relationship becomes steadily lower as the difference in diopters of focal 
power required between tests is increased. 

2. This population had increasingly better vision as the focal distance 
was increased. 

3. The spread of individual differences was more normally distributed 
for the near point tests than for the far point tests. The far point tests 
had a greater spread of individual differences expressed by either range 
or standard deviation. 
4. The low relation between visual acuity at any given distance and 
visual acuity at any other distance, and the steady decrease in the 
relationship as the two distances are more widely separated, indicates 
the importance to industrial vision testing of measuring acuity at a 
distance at least approximately equivalent to the distance at which the 
employee must have satisfactory acuity. Basing the measurement of 
one’s visual acuity entirely upon his acuity at 20 feet results in frequent 
errors when the job to be performed requires acuity at a much closer 
distance. 


Received November 18, 1945. 
Hahn, M. E., and Brayfield, A. H. Occupational laboratory manual. Pp. 
29. $1.00. Job exploration workbook. Pp. 95. $.96. Chicago: 
Science Research Associates, 1945. 


Vocational counselors who have known about the courses in occupa- 
tions at the University of Minnesota have been awaiting, for some time, 
the publication of materials which they have known must have been in 
the process of development there. It seemed certain that an institution 
doing so much in the development of tests and in the preparation of texts 
on counseling techniques would also make available the materials developed 
in its occupations courses. Hahn and Brayfield have now done 
this, and counselors who have looked in vain for suitable materials can 
now have the benefit of the Minnesota experience in the teaching of oc- 
cupational information courses. 

The first-named pamphlet is the counselor’s or teacher’s Manual. 
It discusses the plans and objectives of the Occupational Laboratory, as 
the course is called, the projects constituting the course, its conduct, and 
available counseling tools and techniques. It is emphasized that in this 
course the traditional recitation and examination procedure has no place, 
that it is, instead, a part of the counseling program and presupposes a 
competent counselor and extensive use of individual interviews. Assignments 
are tailored to the needs and interests of each individual. As 
the authors point out, many courses in occupations stress the presenta- 
tion of occupational information: this course, on the contrary, stresses 
what Kitson has long urged, namely, the teaching of techniques of studying 
occupations and of making initial vocational adjustments. It recognizes 
that no student will retain the vast array of information covered 
by a survey of occupations. Instead, it helps him study his own problem 
of occupational orientation and teaches him techniques he can use again 
as his problems change or become clearer. The authors’ suggestion that 
the course may, in the absence of a competent counselor, be utilized as a 
social science course is probably to be regretted, for too many of its values 
depend upon coordinated counseling. 

The authors state that the materials can be used from grades 9 
through 14, but only three of the 14 projects are actually recommended 
for use in the 9th grade. This reviewer believes that they are particu- 
larly adapted to the last year of high school and to college, and, at those 
levels, to students who are about to enter the world of work. There are 
valuable suggestions for use in junior high school and with non-terminal 
students at other levels, but a number of additional projects would need 
to be developed for such courses. 

The brief section on counseling techniques is fortunate in its discussion 
of counseling by fields and levels, rather than in terms of specific occupa- 
tions. This fits in with the current emphasis on job families, and with 
the concept of vocational guidance and choice as a developmental process 
rather than an event. The note on personality inventories as indicators 
of possible difficulties in adjusting to a job rather than as guides in the 
choice of a vocation is in line with research on personality inventories. 

But the manual is necessarily sketchy and serves as a reminder to the 
experienced counselor rather than as instruction of the novice. Indeed, 
the novice will frequently be at a loss with these materials, for example in 
locating job descriptions for discussion in connection with Project 5. 
The bibliography has other shortcomings, also, for while it includes some 
elementary and some advanced references for counselors, it omits other 
sources of equal usefulness and different emphasis and content, for ex- 
ample Myers’ text and Clark’s monograph on Life Earnings In Selected 
Occupations. Although there is some discussion of the need for briefing 
students before they begin field work, the counselor who has never con- 
ducted such a project with high school students may, if he does not go to 
more pains than suggested by the manual, be surprised at the number 
and types of problems that arise when inexperienced young people go 
out to meet the public. 

The Workbook contains a very brief introduction for the student, and 
fourteen projects with explanations, directions, and forms. These pro- 
jects are: vocational autobiography, former student survey, job opportu- 
nity survey, survey of employment practices, study of an occupation, 
investigation of training opportunities, getting along on the job, job 
satisfaction, job campaign, preparing a personal data sheet, filling out 
application blanks, evaluating employment agencies, writing a letter of 
application, and conducting a personal interview. Throughout, the 
emphasis is on laboratory procedures, implemented by such things as 
actual letter-writing and interviews in the last two projects, and also by a 
survey of job satisfaction rather than by a study of the relevant literature. 
In fact, better use could be made of the literature on the subject 
as a means of rounding out the presentation of some topics, job satisfaction 
among them, especially in non-urban areas with limited community 
resources. 

In the introduction to the workbook, the fact that the students will 
now ask questions, rather than be asked, is probably misstated and 
overemphasized. Actually, the student is asked a great many questions, and 
told to find the answers. The authors might better have stated that the 
students will be asked questions, not in order to have them recite facts 
already passed on to them by the teacher or by a book, but rather to 
guide them in asking questions the answers to which will be found in real 
life and are important to each of them personally. 

The statement that employers are often best able to help students 
draw their own conclusions concerning problems of vocational planning 
sounds too much like counseling by laymen, which is generally inadequate 
or misleading when not a part of professional counseling. It might better 
have been stated that these persons can often provide information about 
vocational opportunities and requirements which, when supplemented by 
published occupational material and by personal data obtained in ques- 
tionnaires, tests, and interviews, and synthesized and interpreted by a 
trained counselor, will be very valuable. It is novel to find the role of 
test data and of the counselor insufficiently stressed in material emanating 
from the University of Minnesota! 

A few minor suggestions might be made concerning the forms. For 
example, in the vocational autobiography the question, “What do I
know about the occupation (of my choice)?”, would be more helpful if
an outline were provided of the kinds of things the student might and 
should know. But these deficiencies are rare. On the whole, the forms 
should be very useful and in need of little if any supplementation. The 
ability of the instructor will count most in supervising their use. 

It is a pleasure to be able to say that the authors and publishers have 
made available two tools for the study of occupations which have been 
carefully prepared and thoroughly tried out, and which appear to have 
few if any serious shortcomings. As the authors point out, however, 
their usefulness will be adequately realized only when used by competent 
vocational counselors as part of a well-rounded vocational guidance 
program. 


Donald E. Super 
Teachers College, Columbia University 


Cantor, Nathaniel. Employee counseling. New York: McGraw-Hill, 
1945. Pp. viii + 167. $2.00 


A book on employee counseling is long overdue. There is considerable 
activity in this field, although the objectives and procedures of counseling 
in industry have remained quite vague. 

Cantor urges a clear recognition that the activities of industry are 
social as well as economic. “An industrial organization is made up of
individuals who are concerned with the economic activity of earning their 
livelihood, and who are also engaged in living socially.” 

The first part of the book states the general problem and traces the 
development of counseling in American industry. The attempt is made
in the second part of the book to give the reader some insight into
psychological processes involved in human relationships. In the third part
the author discusses the organization of the counseling staff and its re- 
lationships with employees, supervisors, the union, and management. 
There is a selected bibliography on industrial counseling. 

The author points out that most industrial counseling programs have 
started since 1941. Perhaps the recency of the development accounts 
for some of the confusion regarding objectives and procedures. It is to 
be hoped that this book will clarify some of the thinking regarding the 
role of counseling in industry.

Industrial counselors perform one or more of these functions: Services 
that provide specific information to the employee; services that gather
information for the personnel department; and interviews that provide
employees an opportunity to express themselves. Cantor prefers to
assign the counseling duties and the informational and service duties to
different individuals. Such a division of duties seems desirable to the
reviewer also because it clarifies the thinking of employees and counselors
regarding the nature of the counseling relationship.

The discussion of the organization of personality contains such
material as the following: “To be yourself, to accept your own limitations, to
recognize the inconsistencies in the actions of others without feeling too
hostile, and to recognize the inconsistencies in your own behavior without 
feeling too guilty is to approach normality of mind.” According to the 
author, the recognition of the inherent ambivalence of every individual’s 
behavior is the basic psychological premise underlying the counseling 
processes.

The counselor’s sole objective is “to help the employee get rid of or
lessen the intensity of the emotional burden and so free him to do a
better job.” In his description of the counseling process, the author
reflects the influence of Rogers’ nondirective therapy in such statements
as “. . . the only effective way to solve an individual’s problem is for the
individual to face and settle his problem in his own way.” Similarly, the
author describes the counselor as “a person who makes it possible for
the employee to talk aloud, honestly, to himself.”

Some of the examples of counseling presented in the book show that
the counselors were amateurs and were unfamiliar with the principles
outlined in this book.

With thousands of veterans being reabsorbed into industry, the need
for counseling will increase. In the reviewer’s mind there is still a
question regarding the responsibility of industry for the psychotherapy of
employees. The program outlined in this book is primarily applicable
to a large industrial plant, whereas the average factory is too small to 
support a well-developed counseling program. It will be necessary for
community agencies to provide the professional counseling which severe
cases need. 

This book is valuable inasmuch as it helps clarify the objectives and 
procedures of employee counseling. Someone will write a much better 
book in a few years, however, after thinking has been clarified through 
more experience and reflection in this field. 


Charles C. Gibbons 
W. E. Upjohn Institute for Community Research
Kalamazoo, Michigan


Dicks, Russell. Pastoral work and personal counseling. New York: 
MacMillan, 1944. Pp. x + 227. $2.00. 


Expanding from his previous work and writing in the field of ministerial
work for the sick, the author now describes the whole of the pastoral
task of Protestant clergymen, with special reference to their inadequately
understood counseling service. He declares the clergy is as poorly
prepared for this increasingly important function as were physicians
when internship began. Like physicians, counselors should strive
“to cure sometimes, to relieve often, and to comfort always.”

Chaplains claim that seventy-five per cent of their time is given to
work with individuals, their problem being to find time for the tremendous
volume of possible interviews, eight to ten per day being about all a
minister can handle effectively.

Pastoral calling is a fruitful procedure for initiating personal counseling,
but again some successful preachers claim they are “too busy” for
this. The author emphatically condemns such oversight of opportunity 
as a “defence for ignorance on the one hand or a lack of faith upon the 
other,” declaring that every minister should average four calls per day. 

Although declaring that advising is “dictatorship in living,” the author
throughout the book glibly gives considerable advice to his fellow
ministers, all of it so good one could wish it were accepted.

Part II deals in detail with the opportunities for personal counseling
of the sick and dying, bereaved, unemployed, imprisoned, aged and
shut-ins, new and prospective church members. “Ninety per cent of
our marital counseling originates with the wife,” declares the author,
which may indicate the feminization of church work, as well as the more
obvious conclusion. He calls attention to the counseling axiom that no
real help can be given until a counselee suffers sufficiently to want help.
Pre-marital and marital counseling is rightly emphasized as appropriate
work for ministers, because they perform the marriage ceremony and also
because of the church’s historic dicta regarding family morals.
Mismarriage is given as a prevalent cause of drunkenness.

Selecting a spouse and selecting a life-work are mentioned as the two
greatest decisions that a person makes, yet very little space is given to
discussing the latter. Perhaps the author’s general advice applies here,
“do not waste time and run risks so far as your reputation is concerned in
dealing with something you know nothing about.” The author’s
experiences in war-time counseling under the Y.M.C.A.-U.S.O. give the
basis for some new emphases.

Conditions of effective pastoral work and counseling are ably con- 
sidered in Part III and should give all clergymen readers clearer insight 
into the psychological laws governing this work. His philosophical 
statement that creation emanated from suffering is followed by discussion
of the four types of suffering: pain, fear, guilt feelings, and loneliness.
“Our reasoning powers are the first to break under the pressure of
prolonged suffering,” he declares, and “unless viewed from the perspective of
time and the larger experience of living, pain is destructive.”

Two chapters are devoted to “Listening,” presumably a difficult
technique for those who by predilection and training are preachers. He
points out that the confessional of the liturgical churches is limited by
canon law; e.g., lying is a sin theologically but a revealing defense
mechanism psychologically.

Pastors’ failure to keep records is castigated as lack of requisite dis- 
cipline for professional standing as counselors. Preaching is said to be 
rapidly declining as the principal method of carrying on the church
work, but the author outlines a splendid sermon on the counseling
function and procedure of a pastor. The relation of the clergyman to
other professional workers is helpfully described. While appreciating
all of them, the author exalts the pastor as he whose “task is to personalize
the man on the cross.” 

Pastors will like this helpful book; other counselors will profit by its
religious approach.


J. Gustav White 
California State Vocational Rehabilitation Bureau 


Hudson, Holland, and Fish, Marjorie. Occupational therapy in the treat- 
ment of the tuberculous patient. New York: National Tuberculosis 
Association, 1944. Pp. xii + 317. $3.00.


The authors originally planned to write a text and reference book for
undergraduate students of occupational therapy. They have done that
and more. They have contributed a volume which is written in a
refreshing and engaging style and which should be read not only by
occupational therapy students planning to work in tuberculosis hospitals, but by
all practitioners and students of occupational therapy and by mature 
persons considering occupational therapy as a career. This book can 
also be read with profit by other professional workers in tuberculosis
hospitals, by vocational counselors, and by training and placement 
workers dealing with rehabilitation problems. 

The authors, in discussing the selection of books for patients, state 
that “sometimes more may be learned from a text which has been pre- 
pared by a highly competent teacher than from a year’s tutelage by a 
mediocre instructor.” Certainly much can be learned from this book
since the authors bring to its writing the service and experience which have
placed one of them as the Director of Rehabilitation Service of the Na- 
tional Tuberculosis Association and the other as Director of Professional 
Courses in Occupational Therapy at Columbia University. Every page 
reflects the authors’ intimate knowledge of tuberculosis, its nature, diag- 
nosis and treatment; of the tuberculous patient and the tuberculosis 
hospital, as well as the role, training, and techniques of the occupational 
therapist in the total program. Besides giving the student a generous 
insight into their philosophy of therapy, the authors are realistically 
helpful in providing numerous practical suggestions for such specific 
services as library service, musical therapy, graphic and plastic arts, 
woodworking, household and homemaking arts, prevocational and voca- 
tional training, and placement. 

Stress is placed on the treatment of the whole patient, not merely 
treatment of the clinical disease. The student is warned against pro- 
grams in which “the patient remains an abstraction quite as if the bacillus 
led an existence independent of its host.”

The point is also made that the treatment of tuberculosis is not a 
“solo performance”. While the role of the occupational therapist is an 
important one, she cannot operate effectively except under a physician’s 
direction and with the aid of the medical social worker, nurse and those 
other hospital workers whose related roles are well described. The 
authors deplore the fact that so few tuberculosis hospitals have the as- 
sistance of experts in mental hygiene. It is significant that, in a book 
which stresses the need for attention to the whole person, no mention is 
made of the psychologist or vocational counselor. This is due, no doubt, 
less to an oversight on the part of the authors than to a lack of understand- 
ing on the part of psychologists and hospital administrators of the role 
which the psychologist might play in this special type of dynamic and 
individual case treatment. If psychologists and counselors are not a- 
vailable to the hospital staff, some of their functions will be performed by 
the occupational therapist who must act also as a medical social worker 
in the absence of a professionally qualified person in that field. A major 
responsibility of the occupational therapist, say the authors, is first to 
study the patient and then to develop a project to fit the patient whom 
she has studied, and it is in the development of practice based on this
sound philosophy that this stimulating volume makes its contribution. 


Gwendolen G. Schneidler 
Veterans Administration, 
Advisement and Guidance Service, 
Washington, D. C. 


Bills, Arthur Gilbert. The psychology of efficiency. A discussion of the 
hygiene of mental work. New York: Harper & Bros., 1943. Pp. xiv 
+ 361. $2.75. 


As the author points out in his preface, “Books on how to avoid worry
and overcome fear, how to prevent and remove emotional conflicts and 
maladjustments, are plentiful, as are treatises on how to avoid friction in 
dealing with others and how to improve our personalities . . . . Yet an 
almost complete dearth of books exists on the subject of mental effici- 
ency; i.e., how the average, normal, well-adjusted person, geared to a 
daily program of work, can manage to get the most efficient service from 
his own mental equipment.” To supply this lack, the author here brings 
together the principal results of a wide variety of experimental studies on 
mental and motor efficiency. In contrast to several books written for 
industrial engineers and supervisors, the author’s point of view is stated 
as “that of the mental worker himself who would like to know how to ac- 
complish the most with the least wear and tear and the greatest long- 
time satisfaction to himself.” 

The introductory chapter, entitled “The Thinking Machine,” outlines 
the principle that the entire organism, not isolated segments, performs 
every piece of mental and physical work; “At-tention is in large part
bodily tension.” It gives also a preview of those aspects of work-hygiene 
discussed in the following chapters; and a brief resumé of these topics 
will serve to make clear the general nature of the book: Controlling the 
energy level; Mental work, fatigue, rest, recovery, 5 chapters; (fatigue is 
defined as any reduction of efficiency); Sleep, its nature and control; 
Factors in the work setting, attention and distractions, motives and in- 
centives, emotions, suggestion, physical conditions, 5 chapters; Modifica- 
tion of efficiency by learning, age changes, personal organization and 
planning, effective thinking, 4 chapters. 

The level of treatment falls somewhere between that of popular pre- 
sentation and that of detailed and critical exposition of research. More 
space is devoted to results than to apparatus and methods, although 
there is enough about the latter to make the research findings intelligible, 
if one seeks only to understand and not to repeat or criticize the studies 
described. The fact that the discussions are not crowded with citations 
of sources and that the 168 references mentioned represent only a minor 
fraction of the work done, should not bother those readers who are
themselves oriented to the experimental literature, and who bear in mind the
author’s purposes; and it certainly makes for smoother and easier reading. 

The inclusion of supplementary reading references, of 17 pages of 
“Test Items for Review” (true-false, completion, best-answer), and of a 
15-page glossary, suggests that the author had primarily in mind class- 
room readers, and only secondarily those engaged in industrial super- 
vision and planning; although for these latter also it should prove a use- 
ful guide, since practical implications of the principles have been stressed 
throughout. The book is written in a clear, direct, and readable style, 
with frequent summaries. At its level of presentation it should prove a 
valuable addition to the libraries of both students and professional 
workers in this important but hitherto neglected field. 


Forrest A. Kingsbury 
The University of Chicago 











New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be 
sent to Donald G. Paterson, Editor, Department of Psychology, 
University of Minnesota, Minneapolis 14, Minnesota 